

xSTAR Bulk Data Loader


 


xSTAR is VectorSTAR's grid-parallel bulk data loader. It is designed specifically to take advantage of VectorSTAR's memory-mapped columnar architecture. xSTAR can be affordably configured to provide the highest data-loading performance available today using only industry-standard disk subsystems.

xSTAR and ETL

 

xSTAR is not a full-fledged, general-purpose ETL application. In a typical VectorSTAR installation, a dedicated ETL product extracts data from OLTP and other legacy LOB systems and produces CSV files. xSTAR then scans these files and transforms them into the per-column binary data files that the VectorSTAR DBMS memory-maps. A single CSV input file whose rows consist of n fields is transformed into n different binary data files, one per column.
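The row-to-column split described above can be sketched in a few lines of C. This is only an illustration, not xSTAR's actual code or file format: it assumes every field is a decimal integer, that each column is stored as a raw native-endian int32_t array, and that the output files are named "<prefix>.col<i>.bin". The function name split_csv_to_columns and the MAX_COLS limit are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_COLS 64   /* illustrative limit, not an xSTAR constant */

/* Hypothetical helper: read CSV rows of `ncols` integer fields from `in`
 * and write each field to its own file "<prefix>.col<i>.bin" as a raw
 * native-endian int32_t. Returns the number of rows written, or -1 on
 * an I/O or parse error. */
long split_csv_to_columns(FILE *in, int ncols, const char *prefix)
{
    FILE *out[MAX_COLS];
    char path[512], line[4096];
    long rows = 0;
    int i, opened = 0;

    assert(ncols > 0 && ncols <= MAX_COLS);

    /* One output file per column. */
    for (i = 0; i < ncols; i++) {
        snprintf(path, sizeof path, "%s.col%d.bin", prefix, i);
        out[i] = fopen(path, "wb");
        if (!out[i]) goto fail;
        opened++;
    }

    while (fgets(line, sizeof line, in)) {
        char *p = line;
        if (*p == '\n' || *p == '\r' || *p == '\0') continue; /* blank line */
        for (i = 0; i < ncols; i++) {
            char *end;
            int32_t v = (int32_t)strtol(p, &end, 10);
            if (end == p) goto fail;              /* missing or bad field */
            if (fwrite(&v, sizeof v, 1, out[i]) != 1) goto fail;
            p = end;
            if (*p == ',') p++;                   /* step over delimiter */
        }
        rows++;
    }

    for (i = 0; i < opened; i++) fclose(out[i]);
    return rows;

fail:
    for (i = 0; i < opened; i++) fclose(out[i]);
    return -1;
}
```

Calling split_csv_to_columns(in, 2, "orders") on a two-field CSV stream would produce orders.col0.bin and orders.col1.bin, each holding one value per input row.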

Other text formats (such as XML, JSON and NetCDF) are supported through add-ons to the core VectorSTAR product, but maximum loading speed is achieved with CSV-formatted text files.

Immediate availability

Once xSTAR has converted a file from CSV to binary format, the file's data is immediately available to any VectorSTAR DBMS with access to it (either through DAS, or across a SAN).
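To illustrate why a converted file needs no further import step, here is a minimal sketch of how a reader process could map a column file straight into memory with POSIX mmap. It assumes the column is a raw array of int32_t values, which is an assumption made for this example and not a statement about VectorSTAR's real file layout. The function names map_int32_column and unmap_int32_column are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map a binary column file read-only and return a
 * pointer to its values. *count receives the number of int32_t entries.
 * Returns NULL if the file cannot be opened, is empty, or cannot be mapped. */
const int32_t *map_int32_column(const char *path, size_t *count)
{
    struct stat st;
    void *base;
    int fd;

    assert(path != NULL && count != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0) return NULL;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    base = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping outlives the descriptor */
    if (base == MAP_FAILED) return NULL;

    *count = (size_t)st.st_size / sizeof(int32_t);
    return (const int32_t *)base;
}

/* Release a mapping obtained from map_int32_column. */
void unmap_int32_column(const int32_t *col, size_t count)
{
    munmap((void *)col, count * sizeof(int32_t));
}
```

After mapping, the column can be indexed like an ordinary array (col[0], col[1], ...), and the operating system pages the data in on demand.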

Independence from the VectorSTAR engine

xSTAR loader processes can run on any node of a network or grid, without requiring VectorSTAR to be installed and running on that node. This enables the deployment of highly scalable and cost-effective bulk loading systems.

Maximum loading performance

When the highest attainable level of performance is required:

        have xSTAR read directly from a shared pipe connection established by the ETL application, so that no intermediate text file needs to be created on disk

        program the ETL process to produce multiple single-column CSV files instead of a single multi-column one.
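The two recommendations above can be combined, as the following sketch suggests: a loader that consumes a single-column CSV stream from a pipe and writes the binary column directly. It is an assumption-laden illustration, not xSTAR's interface. It assumes one decimal integer per line, a raw int32_t output format, and a POSIX system. The names load_single_column and load_column_from_fd are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* for fdopen() under strict C modes */

#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: read one integer per line from `in` and append it
 * to `out` as a raw native-endian int32_t. Returns the row count, or -1
 * on a write error. Blank lines are skipped. */
long load_single_column(FILE *in, FILE *out)
{
    char line[64];
    long rows = 0;

    assert(in != NULL && out != NULL);

    while (fgets(line, sizeof line, in)) {
        char *end;
        int32_t v = (int32_t)strtol(line, &end, 10);
        if (end == line) continue;                 /* nothing parsed */
        if (fwrite(&v, sizeof v, 1, out) != 1) return -1;
        rows++;
    }
    return rows;
}

/* Same as above, but reads from a file descriptor such as the read end
 * of a pipe or an opened FIFO, so no text file ever touches the disk.
 * Takes ownership of `fd` and closes it before returning. */
long load_column_from_fd(int fd, FILE *out)
{
    FILE *in = fdopen(fd, "r");
    long rows;

    if (!in) {
        close(fd);
        return -1;
    }
    rows = load_single_column(in, out);
    fclose(in);                                    /* also closes fd */
    return rows;
}
```

In a shell pipeline, the same idea amounts to passing STDIN_FILENO as the descriptor, with the ETL process writing its rows to standard output.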

Just-in-Time CSV Loader

Based on a description of the row structure of the CSV data file to be loaded, VectorSTAR's data loader generates a loader specific to that file: a customized C program that is highly optimized for that exact row structure. Essentially, the loader consists of a loop that is executed once for every row in the CSV file. The program is compiled before loading starts, and the tight code inside the loop makes no function calls at all; it uses only macros and direct pointer references.
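As a rough idea of what such a generated loader might look like, the sketch below hard-codes a row structure of two integer fields. It is not output from xSTAR; the macro names PARSE_INT32 and SKIP_DELIM and the function load_rows_int32_int32 are invented for this example. It assumes well-formed input with LF line endings, held in a memory buffer, and output arrays large enough for every row.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Parse an optionally signed decimal integer at p, store it through dst
 * (advancing dst), and leave p on the delimiter that follows.
 * Expands inline, so the row loop contains no function call. */
#define PARSE_INT32(p, end, dst)                                   \
    do {                                                           \
        int32_t v_ = 0;                                            \
        int neg_ = 0;                                              \
        if ((p) < (end) && *(p) == '-') { neg_ = 1; (p)++; }       \
        while ((p) < (end) && *(p) >= '0' && *(p) <= '9')          \
            v_ = v_ * 10 + (*(p)++ - '0');                         \
        *(dst)++ = neg_ ? -v_ : v_;                                \
    } while (0)

/* Step over a single delimiter character (',' or '\n'). */
#define SKIP_DELIM(p, end) do { if ((p) < (end)) (p)++; } while (0)

/* Loader specialised for rows of the form "<int>,<int>\n".
 * Returns the number of rows parsed. */
size_t load_rows_int32_int32(const char *buf, size_t len,
                             int32_t *col0, int32_t *col1)
{
    const char *p = buf, *end = buf + len;
    size_t rows = 0;

    assert(buf != NULL && col0 != NULL && col1 != NULL);

    while (p < end) {                     /* one iteration per CSV row */
        PARSE_INT32(p, end, col0); SKIP_DELIM(p, end);   /* field 0, ',' */
        PARSE_INT32(p, end, col1); SKIP_DELIM(p, end);   /* field 1, EOL */
        rows++;
    }
    return rows;
}
```

Because the field count and field types are fixed when the code is generated, the loop body has no per-field type dispatch and no loop over columns, which is the kind of specialization the description above points to.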

Binary File Structure

[Figure: overlap.png]