manatee
“They aren’t quite pandas. Manatees, the pandas of the sea.”
Manatee is a wrapper class around PySpark DataFrames. It adds some much-needed user-friendliness by providing helper methods to the pyspark.sql.dataframe.DataFrame object. It also offers the ability to pair the dataframe with a pyspark.mllib classification or regression model, neatly keeping everything in one place.
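To make the wrapper idea concrete, here is a minimal sketch of the general pattern: an object that holds a DataFrame, forwards unknown attributes to it, adds a helper method, and keeps a model alongside the data. This is not Manatee's actual API. The names `FrameWithModel`, `n_rows`, and `StubFrame` are hypothetical, and `StubFrame` is only a stand-in for a real `pyspark.sql.DataFrame` so the example runs without Spark.

```python
class FrameWithModel:
    """Hypothetical wrapper for illustration; not Manatee's actual API."""

    def __init__(self, frame, model=None):
        self._frame = frame  # the wrapped DataFrame-like object
        self.model = model   # optional model kept alongside the data

    def __getattr__(self, name):
        # __getattr__ runs only when normal lookup fails, so the wrapper's
        # own attributes win and everything else goes to the wrapped frame.
        if name == "_frame":
            # Guard against infinite recursion if _frame is not set yet.
            raise AttributeError(name)
        return getattr(self._frame, name)

    def n_rows(self):
        # Example helper: a friendlier name for the frame's count().
        return self._frame.count()


class StubFrame:
    """Stand-in for pyspark.sql.DataFrame (assumption: only count/columns)."""

    def __init__(self, rows):
        self._rows = rows
        self.columns = list(rows[0].keys()) if rows else []

    def count(self):
        return len(self._rows)


wrapped = FrameWithModel(
    StubFrame([{"x": 1, "y": 0}, {"x": 2, "y": 1}]),
    model="placeholder-model",  # a fitted model object would go here
)
print(wrapped.n_rows(), wrapped.columns, wrapped.model)
```

With a real Spark session, the stub would be replaced by an actual DataFrame, and calls such as `wrapped.count()` or `wrapped.columns` would pass straight through to it.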
This project is in pre-alpha. Check out the documentation.
Contents
This project was last updated on 2016-06-17.
Thanks
Many thanks to Alexey Svyatkovskiy for his help getting me started with Spark and for his support with my numerous questions!