A simple abstraction over the HBaseContext.foreachPartition method.
It allows a user to take a JavaRDD, generate Deletes from its values, and send them to HBase.
The complexity of managing the Connection is removed from the developer
Original JavaRDD with data to iterate over
The name of the table to delete from
Function to convert a value in the JavaRDD to an HBase Delete
The number of deletes to batch before sending to HBase
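A hedged sketch of how these parameters fit together, assuming an existing JavaSparkContext `jsc`, a JavaHBaseContext `hbaseContext`, a reachable HBase cluster, and a hypothetical table `myTable` (imports from `org.apache.hadoop.hbase.*` and `org.apache.spark.api.java.*` elided):

```java
// Sketch: delete one row per RDD element, batching 4 Deletes per flush.
JavaRDD<byte[]> rowKeys = jsc.parallelize(Arrays.asList(
    Bytes.toBytes("row1"), Bytes.toBytes("row2")));

hbaseContext.bulkDelete(rowKeys,
    TableName.valueOf("myTable"),     // table to delete from (hypothetical)
    rowKey -> new Delete(rowKey),     // convert an RDD value to a Delete
    4);                               // batch size before sending to HBase
```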
A simple abstraction over the HBaseContext.mapPartition method.
It allows a user to take a JavaRDD and generate a new RDD based on Gets and the results they bring back from HBase
The name of the table to get from
The number of Gets to retrieve in a single fetch
Original JavaRDD with data to iterate over
Function to convert a value in the JavaRDD to an HBase Get
This will convert the HBase Result object to whatever the user wants to put in the resulting JavaRDD
New JavaRDD created from the results of the Gets to HBase
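A minimal sketch of the parameter order (assuming an existing `jsc`, `hbaseContext`, a live cluster, and a hypothetical table; imports elided):

```java
// Sketch: turn each row key into a Get, fetching 2 rows per multi-get,
// and map each Result to a String for the new RDD.
JavaRDD<byte[]> rowKeys = jsc.parallelize(Arrays.asList(
    Bytes.toBytes("row1"), Bytes.toBytes("row2")));

JavaRDD<String> values = hbaseContext.bulkGet(
    TableName.valueOf("myTable"),           // table to get from (hypothetical)
    2,                                      // batch size per fetch
    rowKeys,                                // original RDD
    rowKey -> new Get(rowKey),              // convert an RDD value to a Get
    (Result result) ->                      // convert a Result for the new RDD
        Bytes.toString(result.getRow()));
```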
A simple abstraction over the HBaseContext.bulkLoad method. It allows a user to take a JavaRDD, convert it into a new JavaRDD[Pair] with a map function, and generate HFiles in stagingDir for bulk loading
The javaRDD we are bulk loading from
The HBase table we are loading into
A Function that will convert a value in JavaRDD to Pair(KeyFamilyQualifier, Array[Byte])
The location on the FileSystem to bulk load into
Options that will define how the HFile for a column family is written
Whether compaction is excluded for the generated HFiles
Max size for the HFiles before they roll
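A hedged sketch of a single-column bulk load (assuming an existing `jsc` and `hbaseContext`; the table name and staging path are hypothetical, and the generated HFiles would still need to be handed to the cluster's incremental-load tool; imports elided):

```java
// Sketch: stage HFiles for one cell per RDD element.
JavaRDD<byte[]> rows = jsc.parallelize(Arrays.asList(
    Bytes.toBytes("row1"), Bytes.toBytes("row2")));

hbaseContext.bulkLoad(rows,
    TableName.valueOf("myTable"),
    rowKey -> new Pair<>(                       // (KeyFamilyQualifier, value)
        new KeyFamilyQualifier(rowKey,
            Bytes.toBytes("cf"), Bytes.toBytes("q")),
        Bytes.toBytes("value")),
    "/tmp/hfile-staging",                       // stagingDir (hypothetical)
    new HashMap<>(),                            // per-family HFile write options
    false,                                      // do not exclude from compaction
    HConstants.DEFAULT_MAX_FILE_SIZE);          // max HFile size before rolling
```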
A simple abstraction over the HBaseContext.bulkLoadThinRows method. It allows a user to take a JavaRDD, convert it into a new JavaRDD[Pair] with a map function, and generate HFiles in stagingDir for bulk loading
The javaRDD we are bulk loading from
The HBase table we are loading into
A Function that will convert a value in JavaRDD to Pair(ByteArrayWrapper, FamiliesQualifiersValues)
The location on the FileSystem to bulk load into
Options that will define how the HFile for a column family is written
Whether compaction is excluded for the generated HFiles
Max size for the HFiles before they roll
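The thin-rows variant differs only in the pair it expects: all cells for one row are collected into a FamiliesQualifiersValues keyed by a ByteArrayWrapper of the row key. A hedged sketch under the same assumptions as above (existing `jsc`/`hbaseContext`, hypothetical names, imports elided):

```java
// Sketch: build one FamiliesQualifiersValues per row key.
JavaRDD<byte[]> rows = jsc.parallelize(Arrays.asList(
    Bytes.toBytes("row1"), Bytes.toBytes("row2")));

hbaseContext.bulkLoadThinRows(rows,
    TableName.valueOf("myTable"),
    rowKey -> {
        FamiliesQualifiersValues fqv = new FamiliesQualifiersValues();
        fqv.add(Bytes.toBytes("cf"), Bytes.toBytes("q"),
                Bytes.toBytes("value"));              // one cell for this row
        return new Pair<>(new ByteArrayWrapper(rowKey), fqv);
    },
    "/tmp/hfile-staging",                             // stagingDir (hypothetical)
    new HashMap<>(),                                  // per-family HFile options
    false,                                            // compaction not excluded
    HConstants.DEFAULT_MAX_FILE_SIZE);
```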
A simple abstraction over the HBaseContext.foreachPartition method.
It allows a user to take a JavaRDD, generate Puts from its values, and send them to HBase. The complexity of managing the Connection is removed from the developer
Original JavaRDD with data to iterate over
The name of the table to put into
Function to convert a value in the JavaRDD to an HBase Put
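A minimal sketch (assuming an existing `jsc` and `hbaseContext`, a live cluster, and a hypothetical table with column family `cf`; imports elided):

```java
// Sketch: write one Put per RDD element into column family "cf".
JavaRDD<byte[]> rowKeys = jsc.parallelize(Arrays.asList(
    Bytes.toBytes("row1"), Bytes.toBytes("row2")));

hbaseContext.bulkPut(rowKeys,
    TableName.valueOf("myTable"),
    rowKey -> new Put(rowKey)
        .addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"),
                   Bytes.toBytes("value")));
```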
A simple enrichment of the traditional Spark Streaming dStream foreach. This function differs from the original in that it offers the developer access to an already connected Connection object
Note: Do not close the Connection object. All Connection management is handled outside this method
Original DStream with data to iterate over
Function to be given an iterator to iterate through the JavaDStream values and a Connection object to interact with HBase
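A hedged sketch (assuming an existing `hbaseContext`, a `JavaDStream<String> javaDStream`, a live cluster, and a hypothetical table; imports elided):

```java
// Sketch: for each partition of each micro-batch, reuse the supplied
// Connection to write Puts. The Connection itself is never closed here.
hbaseContext.foreachPartition(javaDStream,
    (Tuple2<Iterator<String>, Connection> t) -> {
        Table table = t._2().getTable(TableName.valueOf("myTable"));
        while (t._1().hasNext()) {
            table.put(new Put(Bytes.toBytes(t._1().next())));
        }
        table.close();   // close the Table, but never the Connection
    });
```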
A simple enrichment of the traditional Spark javaRdd foreachPartition. This function differs from the original in that it offers the developer access to an already connected Connection object
Note: Do not close the Connection object. All Connection management is handled outside this method
Original javaRdd with data to iterate over
Function to be given an iterator to iterate through the RDD values and a Connection object to interact with HBase
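The RDD variant has the same shape; a sketch assuming an existing `hbaseContext`, a `JavaRDD<String> javaRdd`, a live cluster, and a hypothetical table (imports elided):

```java
// Sketch: reuse the partition's shared Connection for all writes.
hbaseContext.foreachPartition(javaRdd,
    (Tuple2<Iterator<String>, Connection> t) -> {
        Table table = t._2().getTable(TableName.valueOf("myTable"));
        while (t._1().hasNext()) {
            table.put(new Put(Bytes.toBytes(t._1().next())));
        }
        table.close();   // close the Table, but never the Connection
    });
```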
An overloaded version of HBaseContext hbaseRDD that defines the type of the resulting JavaRDD
The name of the table to scan
The HBase scan object to use to read data from HBase
New JavaRDD with results from scan
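A minimal sketch (assuming an existing `hbaseContext`, a live cluster, and a hypothetical table; imports elided):

```java
// Sketch: full-table scan into an RDD of (row key, Result) pairs.
Scan scan = new Scan();
scan.setCaching(100);   // rows fetched per RPC

JavaRDD<Tuple2<ImmutableBytesWritable, Result>> rdd =
    hbaseContext.hbaseRDD(TableName.valueOf("myTable"), scan);
long rowCount = rdd.count();
```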
This function will use the native HBase TableInputFormat with the given scan object to generate a new JavaRDD
The name of the table to scan
The HBase scan object to use to read data from HBase
Function to convert a Result object from HBase into what the user wants in the final generated JavaRDD
New JavaRDD with results from scan
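With the conversion-function overload, each Result is mapped immediately so the resulting RDD holds only what the caller needs. A sketch under the same assumptions (existing `hbaseContext`, live cluster, hypothetical table; imports elided):

```java
// Sketch: scan the table and keep only each row key as a String.
Scan scan = new Scan();

JavaRDD<String> rowKeys = hbaseContext.hbaseRDD(
    TableName.valueOf("myTable"),
    scan,
    (Tuple2<ImmutableBytesWritable, Result> t) ->
        Bytes.toString(t._2().getRow()));
```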
A simple enrichment of the traditional Spark JavaRDD mapPartition. This function differs from the original in that it offers the developer access to an already connected Connection object
Note: Do not close the Connection object. All Connection management is handled outside this method
Note: Make sure to partition correctly to avoid memory issues when getting data from HBase
Original JavaRdd with data to iterate over
Function to be given an iterator to iterate through the RDD values and a Connection object to interact with HBase
Returns a new RDD generated by the user-defined function, just like a normal mapPartition
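A hedged sketch (assuming an existing `hbaseContext`, a `JavaRDD<byte[]> javaRdd` of row keys, Spark 2.x's Iterator-returning FlatMapFunction, a live cluster, and a hypothetical table; imports elided):

```java
// Sketch: enrich each partition's elements with a value fetched from HBase
// while sharing the partition's Connection.
JavaRDD<String> enriched = hbaseContext.mapPartitions(javaRdd,
    (Tuple2<Iterator<byte[]>, Connection> t) -> {
        Table table = t._2().getTable(TableName.valueOf("myTable"));
        List<String> out = new ArrayList<>();
        while (t._1().hasNext()) {
            Result r = table.get(new Get(t._1().next()));
            out.add(Bytes.toString(r.getRow()));
        }
        table.close();          // never close the Connection itself
        return out.iterator();
    });
```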
A simple abstraction over the HBaseContext.streamBulkMutation method.
It allows a user to take a JavaDStream, generate Deletes from its values, and send them to HBase.
The complexity of managing the Connection is removed from the developer
Original DStream with data to iterate over
The name of the table to delete from
Function to convert a value in the JavaDStream to an HBase Delete
The number of deletes to be sent at once
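A minimal sketch (assuming an existing `hbaseContext`, a `JavaDStream<String> javaDStream` of row keys, a live cluster, and a hypothetical table; imports elided):

```java
// Sketch: delete each row named by the stream, 4 Deletes per batch.
hbaseContext.streamBulkDelete(javaDStream,
    TableName.valueOf("myTable"),
    value -> new Delete(Bytes.toBytes(value)),
    4);
```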
A simple abstraction over the HBaseContext.streamMap method.
It allows a user to take a DStream and generate a new DStream based on Gets and the results they bring back from HBase
The name of the table to get from
The number of gets to be batched together
Original DStream with data to iterate over
Function to convert a value in the JavaDStream to an HBase Get
This will convert the HBase Result object to whatever the user wants to put in the resulting JavaDStream
New JavaDStream created from the results of the Gets to HBase
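A hedged sketch mirroring the RDD version (assuming an existing `hbaseContext`, a `JavaDStream<String> javaDStream` of row keys, a live cluster, and a hypothetical table; imports elided):

```java
// Sketch: each streamed row key becomes a Get; Results come back as Strings.
JavaDStream<String> values = hbaseContext.streamBulkGet(
    TableName.valueOf("myTable"),
    2,                                        // gets batched together
    javaDStream,
    value -> new Get(Bytes.toBytes(value)),   // convert a value to a Get
    (Result result) -> Bytes.toString(result.getRow()));
```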
A simple abstraction over the HBaseContext.streamMapPartition method.
It allows a user to take a JavaDStream, generate Puts from its values, and send them to HBase.
The complexity of managing the Connection is removed from the developer
Original DStream with data to iterate over
The name of the table to put into
Function to convert a value in the JavaDStream to an HBase Put
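A minimal sketch (assuming an existing `hbaseContext`, a `JavaDStream<String> javaDStream`, a live cluster, and a hypothetical table with column family `cf`; imports elided):

```java
// Sketch: write one Put per streamed value.
hbaseContext.streamBulkPut(javaDStream,
    TableName.valueOf("myTable"),
    value -> new Put(Bytes.toBytes(value))
        .addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"),
                   Bytes.toBytes("value")));
```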
A simple enrichment of the traditional Spark Streaming JavaDStream mapPartition.
This function differs from the original in that it offers the developer access to an already connected Connection object
Note: Do not close the Connection object. All Connection management is handled outside this method
Note: Make sure to partition correctly to avoid memory issues when getting data from HBase
Original JavaDStream with data to iterate over
Function to be given an iterator to iterate through the JavaDStream values and a Connection object to interact with HBase
Returns a new JavaDStream generated by the user-defined function, just like a normal mapPartition
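A hedged sketch, the streaming counterpart of mapPartitions (assuming an existing `hbaseContext`, a `JavaDStream<byte[]> javaDStream` of row keys, Spark 2.x's Iterator-returning FlatMapFunction, a live cluster, and a hypothetical table; imports elided):

```java
// Sketch: per partition of each micro-batch, fetch from HBase over the
// shared Connection and emit one String per input element.
JavaDStream<String> enriched = hbaseContext.streamMapPartitions(javaDStream,
    (Tuple2<Iterator<byte[]>, Connection> t) -> {
        Table table = t._2().getTable(TableName.valueOf("myTable"));
        List<String> out = new ArrayList<>();
        while (t._1().hasNext()) {
            Result r = table.get(new Get(t._1().next()));
            out.add(Bytes.toString(r.getRow()));
        }
        table.close();          // never close the Connection itself
        return out.iterator();
    });
```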
This is the Java wrapper over HBaseContext, which is written in Scala. This class will be used by developers who want to work with Spark or Spark Streaming in Java
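A minimal sketch of wiring up the wrapper (the application name is hypothetical, and a reachable cluster with `hbase-site.xml` on the classpath is assumed):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.spark.JavaHBaseContext;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

// Sketch: build the Spark context, then pass it with an HBase Configuration
// to the Java wrapper; all methods above hang off this hbaseContext.
SparkConf sparkConf = new SparkConf().setAppName("HBaseSparkJob");
JavaSparkContext jsc = new JavaSparkContext(sparkConf);
Configuration conf = HBaseConfiguration.create();   // reads hbase-site.xml
JavaHBaseContext hbaseContext = new JavaHBaseContext(jsc, conf);
```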