Here we build the functionality to populate the resulting RDD[Row]. This is where we will do the following:
- Filter push down
- Scan or GetList pruning
- Executing our scan(s) and/or GetList to generate the result

A sketch of this flow follows the parameter descriptions below.
The columns being requested by the query
The filters being applied by the query
An RDD[Row] with all of the results from HBase that SparkSQL needs to execute the query
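To make that flow concrete, here is a minimal sketch of such a buildScan, assuming it lives inside the relation class; columnMapping, rowKeyColumn, tableName, hbaseContext, and buildRow are hypothetical stand-ins for the relation's real state and helpers, not the connector's actual API:

```scala
import org.apache.hadoop.hbase.TableName
import org.apache.hadoop.hbase.client.Scan
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.Row
import org.apache.spark.sql.sources.{EqualTo, Filter}

def buildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[Row] = {
  val scan = new Scan()

  // Scan pruning: request only the families/qualifiers the query needs.
  for (col <- requiredColumns; (family, qualifier) <- columnMapping.get(col))
    scan.addColumn(family, qualifier)

  // Filter push down: translate a row-key equality into a narrow scan range;
  // filters we cannot translate are simply left for SparkSQL to re-evaluate.
  filters.foreach {
    case EqualTo(col, value) if col == rowKeyColumn =>
      val key = Bytes.toBytes(value.toString)
      scan.withStartRow(key).withStopRow(key, true) // inclusive stop row
    case _ => // not pushed down
  }

  // Execute the scan and convert each HBase Result into a SparkSQL Row.
  hbaseContext.hbaseRDD(TableName.valueOf(tableName), scan)
    .map { case (_, result) => buildRow(requiredColumns, result) }
}
```

Note that pruning and push down only narrow what HBase returns; with this data source API, SparkSQL re-evaluates the filters over the returned rows, so an untranslated filter costs performance but never correctness.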
Takes an HBase Row object and parses all of its fields. This is independent of which fields were requested from the key; because we have all the data, it is less complex to parse everything. (A sketch of this parsing follows the parameter descriptions below.)
The row retrieved from HBase.
All of the fields in the row key, in the order in which they appear in the row key.
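A hedged sketch of that parsing, assuming a hypothetical Field descriptor; the connector's real field model and type handling will be richer:

```scala
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.sql.Row

// Hypothetical descriptor: column family, qualifier, and SQL type name.
case class Field(family: Array[Byte], qualifier: Array[Byte], sqlType: String)

// Parse every mapped field out of the Result, regardless of which columns the
// query asked for; row-key fields would similarly be sliced from result.getRow()
// in the order they appear in the composite key.
def buildRow(fields: Seq[Field], result: Result): Row = {
  val values = fields.map { f =>
    val bytes = result.getValue(f.family, f.qualifier)
    if (bytes == null) null
    else f.sqlType match {
      case "INT"    => Bytes.toInt(bytes)
      case "LONG"   => Bytes.toLong(bytes)
      case "STRING" => Bytes.toString(bytes)
      case _        => bytes // unknown types surface as raw bytes
    }
  }
  Row.fromSeq(values)
}
```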
Generates a Spark SQL schema object so Spark SQL knows what is being provided by this BaseRelation (see the sketch below).
The schema generated from the SCHEMA_COLUMNS_MAPPING_KEY value
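As an illustration, a minimal schema builder might look like the following; the entry format shown ("<name> <type> <family:qualifier>", comma separated) is an assumption about the mapping value, and the supported types are truncated for brevity:

```scala
import org.apache.spark.sql.types._

// Sketch: assumes entries such as "KEY_FIELD STRING :key, A_FIELD INT c:a".
def buildSchema(schemaMappingString: String): StructType = {
  val fields = schemaMappingString.split(",").map { entry =>
    val parts = entry.trim.split("\\s+")
    val dataType = parts(1).toUpperCase match {
      case "STRING" => StringType
      case "INT"    => IntegerType
      case "LONG"   => LongType
      case "DOUBLE" => DoubleType
      case _        => BinaryType // unmapped types surface as raw bytes
    }
    StructField(parts(0), dataType, nullable = true)
  }
  StructType(fields)
}
```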
SparkSQL context
SparkSQL context
For some codecs, the ordering may be inconsistent between a Java primitive type and its byte array encoding. We may have to split a predicate on one of these Java primitive types into multiple predicates. The encoder takes care of this and returns the concrete ranges.
For example, in the naive codec, some of the Java primitive types have to be split into multiple predicates, and these predicates unioned together for the filter to be evaluated correctly. For instance, given "COLUMN < 2", we transform it into "0 <= COLUMN < 2 OR Integer.MIN_VALUE <= COLUMN <= -1"
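The split exists because, in the naive codec, Bytes.toBytes(Int) writes two's-complement big-endian bytes, so a negative value (sign bit set) compares greater than any non-negative value in unsigned byte order. A small sketch, with an illustrative helper name:

```scala
import org.apache.hadoop.hbase.util.Bytes

object NaiveCodecRanges {
  // In raw byte order -1 sorts *after* 1, because its sign bit is set:
  // Bytes.compareTo(Bytes.toBytes(-1), Bytes.toBytes(1)) > 0 is true.

  // Split "COLUMN < value" (for value >= 0) into the two byte ranges that
  // together cover the correct integers.
  def lessThanRanges(value: Int): Seq[(Array[Byte], Array[Byte])] = Seq(
    // 0 <= COLUMN < value                  (upper bound exclusive)
    (Bytes.toBytes(0), Bytes.toBytes(value)),
    // Integer.MIN_VALUE <= COLUMN <= -1    (upper bound inclusive)
    (Bytes.toBytes(Integer.MIN_VALUE), Bytes.toBytes(-1))
  )
}
```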
Implementation of Spark's BaseRelation that will build up our scan logic, do the scan pruning, filter push down, and value conversions (a skeleton sketch follows the parameter description below)
SparkSQL context
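Putting the pieces together, a skeleton of such a relation might look like this; the constructor parameters are illustrative and the bodies are stubbed, since the real logic is what the earlier sketches describe:

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.sources.{BaseRelation, Filter, PrunedFilteredScan}
import org.apache.spark.sql.types.StructType

class HBaseRelationSketch(
    val sqlContext: SQLContext,
    schemaMappingString: String) extends BaseRelation with PrunedFilteredScan {

  // Schema derived from the column-mapping definition (see the buildSchema
  // sketch above); stubbed here to keep the skeleton compilable.
  override def schema: StructType = StructType(Nil)

  // SparkSQL hands us the pruned column list and the pushed-down filters;
  // the real body builds and executes the Scan and/or GetList as sketched.
  override def buildScan(requiredColumns: Array[String],
                         filters: Array[Filter]): RDD[Row] =
    sqlContext.sparkContext.emptyRDD[Row]
}
```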