July 7, 2018

Hello!

This is the first part of the Scala and Data Science Series. The articles in this series are meant for aspiring data scientists (like me) who wish to learn the full-stack big data and machine learning pipeline. I generally work with Python for data science, but to build a product pipeline with big data tools like Kafka, ZooKeeper and Spark, I felt that learning Scala would be a better option. Hence, I started my course with Cognitive Class. This series will cover the code and resources from that course, plus a few others that I refer to when learning a particular topic.

Basics of Scala

Scala is

A statically typed language. It can infer the type of a variable from the value assigned to it, but declaring the type explicitly is still good practice.
Modular: you don’t have one global namespace for all the classes that are involved.
Compiled to bytecode, which the JVM executes on various platforms.
Checked for type correctness at compile time, prior to deployment, which makes it well suited to large jobs.
Capable of parallel processing
Lightweight
Low on boilerplate
Stable, scalable and innovative.

Variables are declared in one of two ways:

Mutable variable - var
Immutable variable - val
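
To make the points about type inference, val and var concrete, here is a minimal sketch. The object name VariablesDemo and the names count, ratio and total are made up for illustration and do not come from the course material.

```scala
object VariablesDemo {
  // Type inferred from the value: the compiler treats `count` as Int.
  val count = 10

  // The same kind of declaration with the type written out explicitly.
  val ratio: Double = 0.75

  // Adds up the samples using a mutable accumulator.
  def total(samples: List[Int]): Int = {
    var sum = 0                   // var: can be reassigned
    for (s <- samples) sum += s   // reassignment is allowed here
    sum
  }

  def main(args: Array[String]): Unit = {
    // count = 11                 // would not compile: reassignment to val
    println(s"count = $count, ratio = $ratio")
    println(s"total = ${total(List(1, 2, 3))}")
  }
}
```

In everyday Scala code val is usually the default choice, and var is kept for the few places where a value really has to change.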

Why Scala for data science

Centrality and dispersion measures
ROC curves
Feature engineering
Support vector machines
Big data support and tools
Many big data tools, such as Spark and Kafka, are written largely in Scala and have very good support for Scala APIs
Parallel processing
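
As a small taste of the first item in the list above, here is a rough sketch of centrality (mean, median) and dispersion (variance, standard deviation) measures written with plain Scala collections. The object name SummaryStats, the function names and the sample data are all hypothetical, and the functions assume a non-empty input. In a real pipeline you would more likely rely on a library such as Spark.

```scala
object SummaryStats {
  // Arithmetic mean; assumes xs is non-empty.
  def mean(xs: Seq[Double]): Double = xs.sum / xs.size

  // Median: the middle value of the sorted data, or the average
  // of the two middle values when the size is even.
  def median(xs: Seq[Double]): Double = {
    val sorted = xs.sorted
    val n = sorted.size
    if (n % 2 == 1) sorted(n / 2)
    else (sorted(n / 2 - 1) + sorted(n / 2)) / 2.0
  }

  // Population variance: mean of the squared deviations from the mean.
  def variance(xs: Seq[Double]): Double = {
    val m = mean(xs)
    xs.map(x => (x - m) * (x - m)).sum / xs.size
  }

  // Standard deviation: square root of the population variance.
  def stdDev(xs: Seq[Double]): Double = math.sqrt(variance(xs))

  def main(args: Array[String]): Unit = {
    // Made-up sample data, only for illustration.
    val data = Seq(2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0)
    println(s"mean = ${mean(data)}, median = ${median(data)}")
    println(s"variance = ${variance(data)}, stdDev = ${stdDev(data)}")
  }
}
```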

Website - http://scala-lang.org

The next blog will be about creating a Scala project.

To learn more about Scala, stay tuned.

Hope this helps! Stay tuned for more blogs from the ML series.

Happy Learning!

Rajiv Jha :)

Rajiv Jha

My name is Rajiv Jha. I am a senior engineering student at Guru Gobind Singh Indraprastha University.
