core

infer_column_type

def infer_column_type(data: DataFrame, col: str) -> ColumnType

Get ColumnType for a given column of a dataframe.

Returns None for datetime columns if treat_datetime_as_numerical is True; otherwise treats datetime columns as numerical.

Arguments:

  • data: dataset as a pandas dataframe
  • col: column to get type for
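As a rough illustration of dtype-based type inference, the sketch below classifies a column using pandas dtype checks. The `ColumnTypeSketch` enum and `infer_column_type_sketch` function are hypothetical stand-ins, not the library's own `ColumnType` or implementation, and the sketch does not model the treat_datetime_as_numerical behaviour described above.

```python
from enum import Enum

import pandas as pd
from pandas.api.types import is_datetime64_any_dtype, is_numeric_dtype


class ColumnTypeSketch(Enum):
    """Hypothetical stand-in for the library's ColumnType."""

    CATEGORICAL = 'categorical'
    DATETIME = 'datetime'
    NUMERICAL = 'numerical'


def infer_column_type_sketch(data: pd.DataFrame, col: str) -> ColumnTypeSketch:
    """Classify a column by its pandas dtype (assumed rules, for illustration)."""
    dtype = data[col].dtype
    # Check datetimes first so they are not lumped in with other dtypes
    if is_datetime64_any_dtype(dtype):
        return ColumnTypeSketch.DATETIME
    if is_numeric_dtype(dtype):
        return ColumnTypeSketch.NUMERICAL
    # Anything else (strings, objects, ...) is treated as categorical here
    return ColumnTypeSketch.CATEGORICAL
```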

isnumber

def isnumber(*values: Union[str, float])

Check whether all values are numeric strings (ints, floats, etc.).
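One plausible way to perform such a check is to attempt a `float()` conversion on each value. The `isnumber_sketch` name is hypothetical and the real function may apply different rules; note that this version also accepts strings such as 'nan' and 'inf', and returns True when called with no values.

```python
from typing import Union


def isnumber_sketch(*values: Union[str, float]) -> bool:
    """Return True if every value can be converted to a float."""
    for value in values:
        try:
            float(value)
        except (TypeError, ValueError):
            # A single non-numeric value fails the whole check
            return False
    return True
```

For example, `isnumber_sketch('1', '2.5', -3)` passes, while `isnumber_sketch('1', 'abc')` does not.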

apply_range_filter

def apply_range_filter(range_filter: str,
                       column: Series) -> Optional['Series[bool]']

Apply a 'range' filter on a column

Arguments:

  • range_filter: string of comma-separated ranges in the format [-5.4, 6.5], [:, :], where : stands for an unbounded end (-infinity as a lower bound, +infinity as an upper bound)
  • column: series data
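The range format can be illustrated with a small parser that turns the string into a boolean mask. This is a sketch under stated assumptions: the `range_mask_sketch` name and `RANGE_PATTERN` regex are hypothetical, bounds are treated as inclusive, multiple ranges are combined with OR, and None is returned when no range is found. The library's actual behaviour may differ on each of these points.

```python
import math
import re
from typing import Optional

import pandas as pd

# Matches one "[low, high]" pair; either side may be ":" for an open end
RANGE_PATTERN = re.compile(r'\[\s*([^,\[\]]+?)\s*,\s*([^,\[\]]+?)\s*\]')


def range_mask_sketch(range_filter: str, column: pd.Series) -> Optional[pd.Series]:
    """Build a boolean mask selecting rows that fall in any listed range."""
    mask = None
    for low_text, high_text in RANGE_PATTERN.findall(range_filter):
        low = -math.inf if low_text == ':' else float(low_text)
        high = math.inf if high_text == ':' else float(high_text)
        # Assumption: both bounds are inclusive
        in_range = (column >= low) & (column <= high)
        # Assumption: a row passes if it matches at least one range
        mask = in_range if mask is None else (mask | in_range)
    return mask
```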

apply_values_filter

def apply_values_filter(values_filter: str, column: Series,
                        col_type: ColumnType) -> Optional['Series[bool]']

Apply a 'values' filter on a column

Arguments:

  • values_filter: a range or comma-separated list of ranges
  • column: series data
  • col_type: ColumnType of the column

parseISO

def parseISO(date: str) -> numpy.datetime64

Parse an ISO datestring to numpy.datetime64.

Handles 'Z' being used instead of +0000 as the timezone designator, which is common in JavaScript but which the Python APIs don't accept.

Arguments:

  • date: ISO datetime string created on the frontend
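The 'Z' handling can be sketched as follows. This is an illustration, not the library's implementation: `parse_iso_sketch` is a hypothetical name, and the choice to normalise to UTC and drop the timezone before building the `numpy.datetime64` is an assumption.

```python
from datetime import datetime, timezone

import numpy as np


def parse_iso_sketch(date: str) -> np.datetime64:
    """Parse an ISO datestring, accepting a trailing 'Z' as UTC."""
    # Older datetime.fromisoformat versions reject 'Z', so spell out the offset
    if date.endswith('Z'):
        date = date[:-1] + '+00:00'
    parsed = datetime.fromisoformat(date)
    # Assumption: store the value as naive UTC, since datetime64 has no timezone
    if parsed.tzinfo is not None:
        parsed = parsed.astimezone(timezone.utc).replace(tzinfo=None)
    return np.datetime64(parsed)
```

A string like '2021-03-04T05:06:07.000Z', as produced by JavaScript's `Date.prototype.toISOString()`, is accepted by this sketch.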

apply_date_filter

def apply_date_filter(
        from_date: str, to_date: str,
        column: 'Series[numpy.datetime64]') -> Optional['Series[bool]']

Apply 'from_date' and 'to_date' filters to a column

Arguments:

  • from_date: ISO datestring representing the 'from' value
  • to_date: ISO datestring representing the 'to' value
  • column: series data
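A date filter of this shape could be sketched as below. The names `date_mask_sketch` and `_to_naive_utc` are hypothetical, and several behaviours are assumptions rather than documented facts: an empty string means "no bound", both bounds are inclusive, the column holds timezone-naive datetimes, and None is returned when neither bound is given.

```python
from typing import Optional

import pandas as pd


def _to_naive_utc(date: str) -> pd.Timestamp:
    """Parse an ISO string and drop the timezone after converting to UTC."""
    stamp = pd.Timestamp(date)
    if stamp.tzinfo is not None:
        stamp = stamp.tz_convert(None)
    return stamp


def date_mask_sketch(from_date: str, to_date: str,
                     column: pd.Series) -> Optional[pd.Series]:
    """Build a boolean mask for rows between from_date and to_date."""
    mask = None
    if from_date:
        mask = column >= _to_naive_utc(from_date)
    if to_date:
        upper = column <= _to_naive_utc(to_date)
        # Combine with the lower bound when both are present
        mask = upper if mask is None else (mask & upper)
    return mask
```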

apply_filters

def apply_filters(variable_filters: List[FilterInstance],
                  data: DataFrame) -> DataFrame

Apply filters on data

Arguments:

  • variable_filters: list of filters to apply
  • data: data to filter

get_filter_stats

def get_filter_stats(input_data: DataFrame, output_data: DataFrame,
                     filters: List[FilterInstance]) -> FilterStats

Get filter statistics

Arguments:

  • input_data: raw input data
  • output_data: filtered data
  • filters: list of filters applied