Showing posts from August, 2018

Spark, Dataframes, PostgreSQL

Spark is one of the most successful Apache projects and one of the most sought-after skills for big data engineers; many companies look for it specifically when hiring.

Spark is distributed computing software that can employ multiple machines (a cluster), which means you can scale horizontally (scale out) by adding more computers instead of having to buy or rent computers with more CPU and memory (scaling vertically, or scaling up).

Setting up (Standalone Mode)
brew install apache-spark

1. Run master : /usr/local/Cellar/apache-spark/2.3.1/bin/spark-class org.apache.spark.deploy.master.Master
2. Run slave(s) : /usr/local/Cellar/apache-spark/2.3.1/bin/spark-class org.apache.spark.deploy.worker.Worker spark://:7077 -c 1 -m 512M
   You will get the master URL in the console output after running step 1, and you can run slaves either in another terminal or on another computer connected to the same network.
3. Run example on master : /usr/local/Cellar/apache-spark…
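The worker command above takes the master URL, a core count (-c) and a memory limit (-m). As a rough sketch of how those pieces fit together, the small Python helper below assembles the same command line. The function names and the host name "my-master-host" are made up for illustration; the real host comes from the master's console output, and the install path is the Homebrew one shown above.

```python
def master_url(host, port=7077):
    """Build a standalone master URL of the form spark://host:port."""
    return f"spark://{host}:{port}"


def worker_command(spark_home, host, cores=1, memory="512M"):
    """Return the argument list that starts one worker against the master.

    Mirrors the command in the post: spark-class, the Worker class,
    the master URL, then -c (cores) and -m (memory).
    """
    return [
        f"{spark_home}/bin/spark-class",
        "org.apache.spark.deploy.worker.Worker",
        master_url(host),
        "-c", str(cores),
        "-m", memory,
    ]


if __name__ == "__main__":
    # "my-master-host" is a placeholder, not a value from the post.
    cmd = worker_command("/usr/local/Cellar/apache-spark/2.3.1", "my-master-host")
    print(" ".join(cmd))
```

Running one such command per terminal or per machine, each pointing at the same master URL, is what adds workers to the cluster.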

Microsoft Azure - first impressions

So, after GCP and AWS, it was time to get my hands dirty with Azure. When I first heard Microsoft was jumping onto the bandwagon of public cloud vendors, it seemed like a wannabe. GCP itself had a lot of catching up to do with AWS. IBM and Oracle have been at it for a long time, but they have settled for a different kind of customer; e.g., Oracle focuses primarily on deploying its own legacy software like E-Business Suite and modern on-premise software like Oracle Fusion into its Oracle Public Cloud instead of trying to take market share from AWS or GCP. IBM has managed to land some real customers like EA but is not quite there yet.
On the other hand, I think Microsoft Azure has come from behind and has not only posed a serious challenge to GCP and AWS but is also capturing a major market share with its unique offerings in Machine Learning services and its fantastic partnerships with, and support for, corporations.

From whatever I have explored so far, I have a mixed opinion about it. …