public interface HiveInspectors

1. The underlying data types in Catalyst and in Hive
 In Catalyst:
  Primitive =>
     UTF8String
     int / scala.Int
     boolean / scala.Boolean
     float / scala.Float
     double / scala.Double
     long / scala.Long
     short / scala.Short
     byte / scala.Byte
     Decimal
     Array[Byte]
     java.sql.Date
     java.sql.Timestamp
  Complex Types =>
    Map: MapData
    List: ArrayData
    Struct: InternalRow
    Union: NOT SUPPORTED YET
  The complex types act as containers, which can hold arbitrary data types.
 
 In Hive, the native data types are varied; in UDF/UDAF/UDTF they are associated with
 Object Inspectors, and in Hive's expression evaluation framework the underlying data are
 either a Primitive Type or a Complex Type:
 Primitive Type
   Java Boxed Primitives:
       org.apache.hadoop.hive.common.type.HiveVarchar
       org.apache.hadoop.hive.common.type.HiveChar
       java.lang.String
       java.lang.Integer
       java.lang.Boolean
       java.lang.Float
       java.lang.Double
       java.lang.Long
       java.lang.Short
       java.lang.Byte
       org.apache.hadoop.hive.common.type.HiveDecimal
       byte[]
       java.sql.Date
       java.sql.Timestamp
   Writables:
       org.apache.hadoop.hive.serde2.io.HiveVarcharWritable
       org.apache.hadoop.hive.serde2.io.HiveCharWritable
       org.apache.hadoop.io.Text
       org.apache.hadoop.io.IntWritable
       org.apache.hadoop.hive.serde2.io.DoubleWritable
       org.apache.hadoop.io.BooleanWritable
       org.apache.hadoop.io.LongWritable
       org.apache.hadoop.io.FloatWritable
       org.apache.hadoop.hive.serde2.io.ShortWritable
       org.apache.hadoop.hive.serde2.io.ByteWritable
       org.apache.hadoop.io.BytesWritable
       org.apache.hadoop.hive.serde2.io.DateWritable
       org.apache.hadoop.hive.serde2.io.TimestampWritable
       org.apache.hadoop.hive.serde2.io.HiveDecimalWritable
 Complex Type
   List: Object[] / java.util.List
   Map: java.util.Map
   Struct: Object[] / java.util.List / java POJO
   Union: class StandardUnion { byte tag; Object object }
 
NOTICE: HiveVarchar/HiveChar are not supported by Catalyst; they are simply treated as the String type.
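
To make the pairings above concrete, here is a minimal, illustrative Scala sketch (not code from the trait) that builds a few of the Hive-side writables next to their Catalyst counterparts, using only the Hadoop/Hive classes listed above:

 import org.apache.hadoop.io.{BytesWritable, IntWritable, Text}
 import org.apache.hadoop.hive.common.`type`.HiveDecimal
 import org.apache.hadoop.hive.serde2.io.HiveDecimalWritable
 import org.apache.spark.sql.types.Decimal
 import org.apache.spark.unsafe.types.UTF8String

 // Catalyst-side values ...
 val catalystInt     = 1
 val catalystString  = UTF8String.fromString("abc")
 val catalystBinary  = Array[Byte](1, 2, 3)
 val catalystDecimal = Decimal("3.14")

 // ... and the corresponding Hive writables from the list above.
 val hiveInt     = new IntWritable(catalystInt)
 val hiveString  = new Text(catalystString.toString)
 val hiveBinary  = new BytesWritable(catalystBinary)
 val hiveDecimal = new HiveDecimalWritable(
   HiveDecimal.create(catalystDecimal.toJavaBigDecimal))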
2. Hive ObjectInspector is a group of flexible APIs for inspecting values in different data representations, and developers can extend those APIs as needed, so technically an object inspector supports arbitrary data types in Java.
Fortunately, only a few built-in Hive Object Inspectors are used in generic UDF/UDAF/UDTF evaluation.
1) Primitive Types (PrimitiveObjectInspector and its subclasses)
 public interface PrimitiveObjectInspector {
   // Java Primitives (java.lang.Integer, java.lang.String etc.)
   Object getPrimitiveJavaObject(Object o);
   // Writables (hadoop.io.IntWritable, hadoop.io.Text etc.)
   Object getPrimitiveWritableObject(Object o);
   // Object inspectors that only inspect the `writable` form always return true here;
   // callers need to check this before invoking the two methods above.
   boolean preferWritable();
   ...
 }
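
For example, a caller that wants the Java value of an int column has to respect preferWritable(). The following is a minimal sketch of that check (illustrative only, not the trait's actual code; unwrapInt is a hypothetical helper name):

 import org.apache.hadoop.hive.serde2.objectinspector.primitive.IntObjectInspector
 import org.apache.hadoop.io.IntWritable

 // Hypothetical helper: extract an Int from a Hive value using its inspector.
 def unwrapInt(data: Any, oi: IntObjectInspector): Any = {
   if (data == null) {
     null
   } else if (oi.preferWritable()) {
     // The inspector works on the Writable form; extract the writable, then its value.
     oi.getPrimitiveWritableObject(data).asInstanceOf[IntWritable].get()
   } else {
     // The inspector hands back the boxed java.lang.Integer directly.
     oi.getPrimitiveJavaObject(data)
   }
 }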
 
 
 2) Complex Types:
   ListObjectInspector: inspects java array or List
   MapObjectInspector: inspects Map
    StructObjectInspector: inspects java array, List and
                           even a normal java object (POJO)
   UnionObjectInspector: (tag: Int, object data) (TODO: not supported by SparkSQL yet)
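
As a concrete illustration of the list case, the sketch below builds a standard list inspector over java Integers with Hive's public factories and walks a plain java.util.List through it (illustrative only):

 import java.util.Arrays
 import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory
 import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory

 // A list-of-int inspector over plain java objects.
 val listOI = ObjectInspectorFactory.getStandardListObjectInspector(
   PrimitiveObjectInspectorFactory.javaIntObjectInspector)

 val data = Arrays.asList(Integer.valueOf(1), Integer.valueOf(2), Integer.valueOf(3))
 val n = listOI.getListLength(data)                                 // 3
 val elems = (0 until n).map(i => listOI.getListElement(data, i))   // boxed Integers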
 
3) ConstantObjectInspector: A constant object inspector can be of either primitive or complex type, and it bundles a constant value as its property; usually the value is created when the constant object inspector is constructed.
 public interface ConstantObjectInspector extends ObjectInspector {
 Object getWritableConstantValue();
 ...
 }
 
 Hive provides three kinds of built-in constant object inspectors:
 Primitive Object Inspectors:
     WritableConstantStringObjectInspector
     WritableConstantHiveVarcharObjectInspector
     WritableConstantHiveCharObjectInspector
     WritableConstantHiveDecimalObjectInspector
     WritableConstantTimestampObjectInspector
     WritableConstantIntObjectInspector
     WritableConstantDoubleObjectInspector
     WritableConstantBooleanObjectInspector
     WritableConstantLongObjectInspector
     WritableConstantFloatObjectInspector
     WritableConstantShortObjectInspector
     WritableConstantByteObjectInspector
     WritableConstantBinaryObjectInspector
     WritableConstantDateObjectInspector
 Map Object Inspector:
     StandardConstantMapObjectInspector
 List Object Inspector:
      StandardConstantListObjectInspector
  Struct Object Inspector: Hive doesn't provide a built-in constant object inspector for Struct
  Union Object Inspector: Hive doesn't provide a built-in constant object inspector for Union
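
The sketch below shows how a constant inspector bundles its value, using the trait's own getIntWritableConstantObjectInspector (listed in the method summary below). It is illustrative only and assumes the trait is visible from the calling code:

 import org.apache.hadoop.hive.serde2.objectinspector.ConstantObjectInspector
 import org.apache.spark.sql.hive.HiveInspectors

 object ConstantOISketch extends HiveInspectors {
   // A constant int inspector bundling the value 42 as a writable.
   val constOI = getIntWritableConstantObjectInspector(42)
     .asInstanceOf[ConstantObjectInspector]
   val bundled = constOI.getWritableConstantValue()   // an IntWritable holding 42
 }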
 3. This trait facilitates:
    Data Unwrapping: Hive Data => Catalyst Data (unwrap)
    Data Wrapping: Catalyst Data => Hive Data (wrap)
    Binding the Object Inspector for Catalyst Data (toInspector)
    Retrieving the Catalyst Data Type from an Object Inspector (inspectorToDataType)
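
A minimal round-trip sketch of those four operations, again illustrative only and assuming the trait is visible from the calling code:

 import org.apache.spark.sql.hive.HiveInspectors
 import org.apache.spark.sql.types.IntegerType

 object RoundTripSketch extends HiveInspectors {
   // Bind an inspector for a Catalyst type, wrap a Catalyst value into its Hive
   // representation, then unwrap it back.
   val oi = toInspector(IntegerType)            // ObjectInspector for Catalyst ints
   val hiveValue = wrap(42, oi, IntegerType)    // Catalyst => Hive
   val unwrap = unwrapperFor(oi)                // prebuilt Hive => Catalyst function
   val catalystValue = unwrap(hiveValue)        // back to 42
   val dt = inspectorToDataType(oi)             // IntegerType again
 }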
4. Future Improvement (TODO)
   This implementation is quite ugly and inefficient:
   a. Pattern matching at runtime
   b. Creation of many small objects when converting Catalyst data => writables
   c. Unnecessary unwrap / wrap for nested UDF invocations, e.g.
      date_add(printf("%s-%s-%s", a, b, c), 3): we don't need to unwrap the data for printf
      and wrap it again before passing it into date_add.
| Modifier and Type | Interface and Description |
|---|---|
| static class | HiveInspectors.typeInfoConversions |
| Modifier and Type | Method and Description |
|---|---|
| DecimalType | decimalTypeInfoToCatalyst(org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector inspector) |
| org.apache.hadoop.io.BytesWritable | getBinaryWritable(Object value) |
| org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector | getBinaryWritableConstantObjectInspector(Object value) |
| org.apache.hadoop.io.BooleanWritable | getBooleanWritable(Object value) |
| org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector | getBooleanWritableConstantObjectInspector(Object value) |
| org.apache.hadoop.hive.serde2.io.ByteWritable | getByteWritable(Object value) |
| org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector | getByteWritableConstantObjectInspector(Object value) |
| org.apache.hadoop.hive.serde2.io.DateWritable | getDateWritable(Object value) |
| org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector | getDateWritableConstantObjectInspector(Object value) |
| org.apache.hadoop.hive.serde2.io.HiveDecimalWritable | getDecimalWritable(Object value) |
| org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector | getDecimalWritableConstantObjectInspector(Object value) |
| org.apache.hadoop.hive.serde2.io.DoubleWritable | getDoubleWritable(Object value) |
| org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector | getDoubleWritableConstantObjectInspector(Object value) |
| org.apache.hadoop.io.FloatWritable | getFloatWritable(Object value) |
| org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector | getFloatWritableConstantObjectInspector(Object value) |
| org.apache.hadoop.io.IntWritable | getIntWritable(Object value) |
| org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector | getIntWritableConstantObjectInspector(Object value) |
| org.apache.hadoop.io.LongWritable | getLongWritable(Object value) |
| org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector | getLongWritableConstantObjectInspector(Object value) |
| org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector | getPrimitiveNullWritableConstantObjectInspector() |
| org.apache.hadoop.hive.serde2.io.ShortWritable | getShortWritable(Object value) |
| org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector | getShortWritableConstantObjectInspector(Object value) |
| org.apache.hadoop.io.Text | getStringWritable(Object value) |
| org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector | getStringWritableConstantObjectInspector(Object value) |
| org.apache.hadoop.hive.serde2.io.TimestampWritable | getTimestampWritable(Object value) |
| org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector | getTimestampWritableConstantObjectInspector(Object value) |
| DataType | inspectorToDataType(org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector inspector) |
| boolean | isSubClassOf(java.lang.reflect.Type t, Class<?> parent) |
| DataType | javaTypeToDataType(java.lang.reflect.Type clz) |
| org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector | toInspector(DataType dataType) |
| org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector | toInspector(org.apache.spark.sql.catalyst.expressions.Expression expr) Maps the Catalyst expression to an ObjectInspector; if the expression is a Literal or foldable, a constant writable object inspector is returned, otherwise the inspector is derived from the expression's Catalyst data type. |
| scala.Function1<Object,Object> | unwrapperFor(org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector objectInspector) Builds unwrappers ahead of time according to object inspector types to avoid pattern matching and branching costs per row. |
| scala.Function3<Object,org.apache.spark.sql.catalyst.InternalRow,Object,scala.runtime.BoxedUnit> | unwrapperFor(org.apache.hadoop.hive.serde2.objectinspector.StructField field) Builds unwrappers ahead of time according to object inspector types to avoid pattern matching and branching costs per row. |
| scala.Function1<Object,Object> | withNullSafe(scala.Function1<Object,Object> f) |
| Object[] | wrap(org.apache.spark.sql.catalyst.InternalRow row, scala.Function1<Object,Object>[] wrappers, Object[] cache, DataType[] dataTypes) |
| Object | wrap(Object a, org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector oi, DataType dataType) |
| Object[] | wrap(scala.collection.Seq<Object> row, scala.Function1<Object,Object>[] wrappers, Object[] cache, DataType[] dataTypes) |
| scala.Function1<Object,Object> | wrapperFor(org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector oi, DataType dataType) Wraps with Hive types based on object inspector. |
DecimalType decimalTypeInfoToCatalyst(org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector inspector)
org.apache.hadoop.io.BytesWritable getBinaryWritable(Object value)
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector getBinaryWritableConstantObjectInspector(Object value)
org.apache.hadoop.io.BooleanWritable getBooleanWritable(Object value)
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector getBooleanWritableConstantObjectInspector(Object value)
org.apache.hadoop.hive.serde2.io.ByteWritable getByteWritable(Object value)
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector getByteWritableConstantObjectInspector(Object value)
org.apache.hadoop.hive.serde2.io.DateWritable getDateWritable(Object value)
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector getDateWritableConstantObjectInspector(Object value)
org.apache.hadoop.hive.serde2.io.HiveDecimalWritable getDecimalWritable(Object value)
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector getDecimalWritableConstantObjectInspector(Object value)
org.apache.hadoop.hive.serde2.io.DoubleWritable getDoubleWritable(Object value)
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector getDoubleWritableConstantObjectInspector(Object value)
org.apache.hadoop.io.FloatWritable getFloatWritable(Object value)
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector getFloatWritableConstantObjectInspector(Object value)
org.apache.hadoop.io.IntWritable getIntWritable(Object value)
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector getIntWritableConstantObjectInspector(Object value)
org.apache.hadoop.io.LongWritable getLongWritable(Object value)
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector getLongWritableConstantObjectInspector(Object value)
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector getPrimitiveNullWritableConstantObjectInspector()
org.apache.hadoop.hive.serde2.io.ShortWritable getShortWritable(Object value)
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector getShortWritableConstantObjectInspector(Object value)
org.apache.hadoop.io.Text getStringWritable(Object value)
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector getStringWritableConstantObjectInspector(Object value)
org.apache.hadoop.hive.serde2.io.TimestampWritable getTimestampWritable(Object value)
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector getTimestampWritableConstantObjectInspector(Object value)
DataType inspectorToDataType(org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector inspector)
boolean isSubClassOf(java.lang.reflect.Type t,
                     Class<?> parent)
DataType javaTypeToDataType(java.lang.reflect.Type clz)
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector toInspector(DataType dataType)
dataType - Catalyst data type

org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector toInspector(org.apache.spark.sql.catalyst.expressions.Expression expr)
Maps the Catalyst expression to an ObjectInspector. If the expression is a
 Literal or foldable, a constant writable object inspector is returned;
 otherwise the object inspector is derived from the expression's Catalyst data type.
expr - Catalyst expression to be mapped

scala.Function1<Object,Object> unwrapperFor(org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector objectInspector)
Builds unwrappers ahead of time according to object inspector types to avoid pattern
 matching and branching costs per row.
 Strictly follows the following order in unwrapping (constant OI has the higher priority):
 Constant Null object inspector =>
   return null
 Constant object inspector =>
   extract the value from constant object inspector
 If object inspector prefers writable =>
   extract writable from data and then get the catalyst type from the writable
 Extract the java object directly from the object inspector
 
NOTICE: complex data types require recursive unwrapping.
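
For primitive inspectors the order above amounts to roughly the following sketch (illustrative only, not the trait's actual code; real unwrappers also convert the extracted value to its Catalyst representation):

 import org.apache.hadoop.hive.serde2.objectinspector.{ConstantObjectInspector, PrimitiveObjectInspector}

 // Hypothetical helper following the documented priority order for a primitive OI.
 def unwrapPrimitive(data: Any, oi: PrimitiveObjectInspector): Any = oi match {
   case c: ConstantObjectInspector if c.getWritableConstantValue == null =>
     null                                  // constant null object inspector
   case c: ConstantObjectInspector =>
     c.getWritableConstantValue            // constant OI: take the bundled value
   case p if p.preferWritable() =>
     p.getPrimitiveWritableObject(data)    // writable-preferring OI: extract the writable
   case p =>
     p.getPrimitiveJavaObject(data)        // otherwise: extract the java object directly
 }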
objectInspector - the ObjectInspector used to create an unwrapper.

scala.Function3<Object,org.apache.spark.sql.catalyst.InternalRow,Object,scala.runtime.BoxedUnit> unwrapperFor(org.apache.hadoop.hive.serde2.objectinspector.StructField field)
Builds unwrappers ahead of time according to object inspector types to avoid pattern
 matching and branching costs per row.
field - The HiveStructField to create an unwrapper for.

scala.Function1<Object,Object> withNullSafe(scala.Function1<Object,Object> f)
Object wrap(Object a,
            org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector oi,
            DataType dataType)
Object[] wrap(org.apache.spark.sql.catalyst.InternalRow row,
              scala.Function1<Object,Object>[] wrappers,
              Object[] cache,
              DataType[] dataTypes)
Object[] wrap(scala.collection.Seq<Object> row,
              scala.Function1<Object,Object>[] wrappers,
              Object[] cache,
              DataType[] dataTypes)
scala.Function1<Object,Object> wrapperFor(org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector oi,
                                          DataType dataType)
Wraps with Hive types based on object inspector.
oi - (undocumented)
dataType - (undocumented)
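
To close, a sketch of how the per-column wrapperFor and the array-based wrap overload might be used together (illustrative only; assumes the trait is visible from the calling code):

 import org.apache.spark.sql.catalyst.InternalRow
 import org.apache.spark.sql.hive.HiveInspectors
 import org.apache.spark.sql.types.{DataType, IntegerType, StringType}
 import org.apache.spark.unsafe.types.UTF8String

 object WrapRowSketch extends HiveInspectors {
   val dataTypes: Array[DataType] = Array(IntegerType, StringType)

   // Prebuild one inspector and one wrapper per column, plus a reusable cache array,
   // so the per-row work is just a loop over the wrappers.
   val inspectors = dataTypes.map(toInspector)
   val wrappers = dataTypes.zip(inspectors).map { case (dt, oi) => wrapperFor(oi, dt) }
   val cache = new Array[AnyRef](dataTypes.length)

   val row = InternalRow(7, UTF8String.fromString("seven"))
   val hiveRow: Array[AnyRef] = wrap(row, wrappers, cache, dataTypes)
 }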