In this post I show you several approaches to running in-database machine learning workloads (R and Python) on SQL Server, along with the pros and cons of each.
Run Spark jobs on Azure Batch – Azure AZTK
With aztk, you can easily deploy and tear down a Spark cluster on Azure, which gives you agility for parallel programming: for example, you can start with low-capacity VMs and then run performance tests on larger or GPU-accelerated VMs, backed by massive cloud computing power.
Here I walk through our machine learning tutorials (PySpark and MLlib) using aztk.
Azure Batch AI – Walkthrough and How it works
In this post I show you the fundamentals of Azure Batch AI (how to use it and how it works) using the Azure CLI. You’ll see that Batch AI significantly simplifies distributed training on Azure infrastructure.
Walkthrough – Distributed training with CNTK (Cognitive Toolkit)
The advantage of CNTK (Cognitive Toolkit) is not only its performance: it also has built-in support for multiple GPUs across multiple machines and scales rapidly with the number of nodes and GPUs. Here I show you a step-by-step walkthrough of distributed training with CNTK.
Mathematical Understanding of Overfitting in Machine Learning
Here we look at several overfitting cases with plenty of visual examples, and explain what you should watch for and how to avoid it. We start with traditional statistical regression, and in the latter part we discuss neural nets.
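As a small illustration of the idea (not taken from the post itself), the sketch below fits two models to the same five training points: a straight line, and a degree-4 polynomial that passes through every point. The data are hypothetical and hand-picked: the true relation is assumed to be y = x, with alternating noise of +1 / -1 added to the training targets. The function names are my own.

```python
def fit_line(xs, ys):
    """Least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lagrange_predict(xs, ys, x):
    """Evaluate at x the unique polynomial passing through all (xs, ys)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def mse(predict, xs, ys):
    """Mean squared error of a prediction function on (xs, ys)."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data: true relation y = x, training targets carry
# hand-picked alternating noise (+1, -1, +1, -1, +1).
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [1.0, 0.0, 3.0, 2.0, 5.0]
test_x = [0.5, 1.5, 2.5, 3.5]
test_y = [0.5, 1.5, 2.5, 3.5]  # noise-free targets from y = x

slope, intercept = fit_line(train_x, train_y)
line = lambda x: slope * x + intercept
poly = lambda x: lagrange_predict(train_x, train_y, x)

for name, model in (("line", line), ("degree-4 polynomial", poly)):
    print(name,
          "| train MSE:", mse(model, train_x, train_y),
          "| test MSE:", mse(model, test_x, test_y))
```

The polynomial reproduces the training targets exactly, yet it does worse than the line on the held-out points, because it has fitted the noise rather than the underlying trend.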
Mathematical Introduction to Regression (and Related Topics)

As a starting point for machine learning, here I’ll show you the fundamental idea of regression with several examples. In this post I’ll focus on maximum likelihood estimation (MLE) to get you started, and later show its drawbacks and several alternatives.
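To make MLE concrete, here is a minimal sketch under one stated assumption: the model is y = slope * x + intercept plus Gaussian noise with constant variance. Under that assumption, maximizing the likelihood gives the same slope and intercept as ordinary least squares, and the variance estimate is the mean squared residual. The data and function names are hypothetical and are not taken from the post.

```python
import math


def fit_gaussian_mle(xs, ys):
    """MLE for y = slope * x + intercept + Gaussian noise.

    Returns (slope, intercept, sigma2). The slope and intercept coincide
    with the least-squares solution; sigma2 divides by n (not n - 2),
    which is what maximizing the likelihood yields.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ssr = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, ssr / n


def negative_log_likelihood(xs, ys, slope, intercept, sigma2):
    """Negative Gaussian log-likelihood of the data under the model."""
    n = len(xs)
    ssr = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return 0.5 * n * math.log(2 * math.pi * sigma2) + ssr / (2 * sigma2)


# Hypothetical sample data.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 0.0, 3.0, 2.0, 5.0]

slope, intercept, sigma2 = fit_gaussian_mle(xs, ys)
best = negative_log_likelihood(xs, ys, slope, intercept, sigma2)
nudged = negative_log_likelihood(xs, ys, slope + 0.1, intercept, sigma2)

print("slope:", slope, "intercept:", intercept, "sigma2:", sigma2)
print("NLL at the MLE:", best, "| NLL with a nudged slope:", nudged)
```

Moving any parameter away from the estimate raises the negative log-likelihood, which is a quick way to sanity-check the closed-form solution.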
Introduce Time Series Analysis with ARIMA (Part 2)
In this post I describe the background and how-to for time-series analysis with more practical and advanced topics: non-stationary time series (ARIMA) and seasonal time series (Seasonal ARIMA). It builds on the basic ideas in my previous post.
With these two posts (Part 1 and Part 2), you can quickly grasp the outline of ARIMA time-series analysis.
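The "I" in ARIMA stands for "integrated": the series is differenced until it looks stationary, and Seasonal ARIMA applies the same idea at the seasonal lag. The sketch below shows only that differencing step on a hypothetical series (a linear trend plus a repeating pattern of period 4). It is not a full ARIMA fit, and the function name is my own.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical series: linear trend (2 per step) plus a seasonal
# pattern that repeats every 4 steps.
season = [0, 5, -3, 2]
series = [2 * t + season[t % 4] for t in range(12)]

# A seasonal difference (lag 4) cancels the repeating pattern and turns
# the linear trend into a constant.
seasonally_differenced = difference(series, lag=4)

# One further ordinary difference (lag 1) removes that constant.
twice_differenced = difference(seasonally_differenced, lag=1)

print("original:             ", series)
print("seasonal difference:  ", seasonally_differenced)
print("then first difference:", twice_differenced)
```

On real data the differenced series would still contain noise. The point here is only that trend and seasonality drop out, leaving something an ARMA-type model can reasonably describe.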
Introduce Time Series Analysis with ARIMA (Part 1)
Time-series analysis in statistical learning is frequently needed in practical systems. Here I outline time-series analysis with the ARIMA model, aimed at helping developers build intuition.
Walkthrough of Azure Cosmos DB Graph (Gremlin)
In this post I show you a walkthrough (tutorials and general tasks) of the graph database with Azure Cosmos DB Gremlin for first-time users, and dive a little into the practical usage of graph queries.
Analyze your data in Azure Data Lake with R (R extension)
When you run R against data in Azure Data Lake, you don’t need to move or download your data. Here I show you how to use R extensions in Azure Data Lake, with real scenarios.