9/3/2023

Redshift Spectrum Parquet

Nexla is happy to announce support for Apache Parquet, a free and open-source column-oriented data storage format. With this release, companies can now convert nearly any data into Parquet for highly optimized, cost-effective queries in Amazon Athena and Redshift Spectrum. Nexla's platform can connect to almost any data source containing CSV, XML, EDI, Avro, JSON or any arbitrary text-delimited data, transform it using an intuitive point-and-click interface, and convert it to Parquet. This allows Nexla users to immediately start querying and gaining insights on data with Amazon Athena and Redshift Spectrum, without data engineering effort.

This has important implications for companies leveraging Amazon Athena or Redshift Spectrum, both of which run queries directly against files on S3. While these technologies support multiple file formats, using Parquet has a significant cost and performance benefit. Both Athena and Redshift Spectrum price a query by the amount of data it scans. For example, if a query runs across 1 TB of CSV files and performs a sum on one of the 20 columns, it scans all the files. The user is billed for the full 1 TB of data, even though only a fraction of that data was relevant to computing the result. "If the same data was in columnar format such as Parquet, then it only scans the relevant column from the files," explained Saket Saurabh, CEO & Co-Founder of Nexla.
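The pricing argument can be made concrete with a little arithmetic. The sketch below is only an illustration, not Nexla's or Amazon's actual billing logic: it assumes all 20 columns are the same size, ignores compression (which in practice usually shrinks Parquet further), and uses a hypothetical rate of 5 USD per TB scanned. The function names `scanned_bytes` and `query_cost` are made up for this example.

```python
TB = 10**12  # one terabyte in bytes (decimal), used for the 1 TB example

# Assumption for illustration only: a flat per-TB scan price.
PRICE_PER_TB_USD = 5.0


def scanned_bytes(total_bytes: int, total_columns: int,
                  columns_read: int, columnar: bool) -> int:
    """Estimate bytes scanned, assuming every column is the same size."""
    if not 0 < columns_read <= total_columns:
        raise ValueError("columns_read must be between 1 and total_columns")
    if not columnar:
        # Row-oriented text such as CSV: every file is read in full.
        return total_bytes
    # Columnar format: only the requested columns are read.
    return total_bytes * columns_read // total_columns


def query_cost(bytes_scanned: int, price_per_tb: float) -> float:
    """Cost of a query billed purely on bytes scanned."""
    return bytes_scanned / TB * price_per_tb


# The example from the text: a sum over 1 of 20 columns in 1 TB of data.
csv_bytes = scanned_bytes(TB, 20, 1, columnar=False)
parquet_bytes = scanned_bytes(TB, 20, 1, columnar=True)

print(f"CSV scan:     {csv_bytes / TB:.2f} TB, "
      f"cost {query_cost(csv_bytes, PRICE_PER_TB_USD):.2f} USD")
print(f"Parquet scan: {parquet_bytes / TB:.2f} TB, "
      f"cost {query_cost(parquet_bytes, PRICE_PER_TB_USD):.2f} USD")
```

Under these simplifying assumptions the columnar query reads one twentieth of the data, so the scan-based charge drops by the same factor.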