
Data Science for Business by Foster Provost & Tom Fawcett (O’Reilly Media)

Data Science for Business does a phenomenal job of teaching the fundamental concepts of Data Science (a.k.a. Data Analysis and Data Mining). Foster Provost and Tom Fawcett use plain English, clear examples and beginner-level math to explain the processes surrounding Data Science and the basics of its algorithms.


The authors walk through the steps of the CRISP-DM process using real-world situations such as Customer Churn and Online Advertising. The most common data analysis models, such as Clustering, Decision Trees and Support Vector Machines, are reviewed and explained in detail. The difference between supervised and unsupervised methods is explained at length. Even if you use software tools that create those models for you, this book will help you understand how to use and test them correctly and how to avoid over-fitting.
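To make the over-fitting point concrete, here is a minimal sketch of my own (it is not taken from the book). It assumes made-up noisy data and a hypothetical "memorizing" model, a 1-nearest-neighbor lookup, and compares it with a simple fixed cut at 0.5. All names and numbers are illustrative.

```python
import random


def make_data(n, rng):
    """Points x in [0, 1); the true label is x > 0.5, flipped 20% of the time."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = x > 0.5
        if rng.random() < 0.2:  # label noise (assumed rate)
            label = not label
        data.append((x, label))
    return data


def one_nn_predict(train, x):
    """Memorizing model: copy the label of the closest training point."""
    return min(train, key=lambda row: abs(row[0] - x))[1]


def threshold_predict(x):
    """Simple model: a single fixed cut at 0.5 (the rule behind the data)."""
    return x > 0.5


def accuracy(predict, data):
    """Fraction of rows where the prediction matches the label."""
    return sum(predict(x) == y for x, y in data) / len(data)


rng = random.Random(0)
train = make_data(200, rng)
holdout = make_data(200, rng)  # data the model never saw

train_acc_1nn = accuracy(lambda x: one_nn_predict(train, x), train)
holdout_acc_1nn = accuracy(lambda x: one_nn_predict(train, x), holdout)
holdout_acc_simple = accuracy(threshold_predict, holdout)

print("1-NN on training data:", train_acc_1nn)
print("1-NN on holdout data: ", holdout_acc_1nn)
print("Fixed cut on holdout: ", holdout_acc_simple)
```

The memorizing model looks perfect on the data it was built from because it has also memorized the noise. The holdout score is the honest one, which is why testing on data the model has not seen matters.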


Each chapter gives multiple examples, and most of the math is supported by graphs. The authors explain every equation in the book step by step. A notable example is chapter 9, where they show how the different parts of the Bayes’ Rule equation come together. There are also special math-intensive sections that business managers might skip, but that software developers and future data scientists should examine closely.
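As a quick illustration of how the parts of Bayes’ Rule fit together, here is a small sketch of my own rather than an example from the book. The function name, parameter names and the churn numbers are all hypothetical.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return p(H | E) using Bayes' Rule.

    prior               -- p(H), belief in the hypothesis before seeing evidence
    likelihood          -- p(E | H), chance of the evidence if H is true
    false_positive_rate -- p(E | not H), chance of the evidence if H is false
    """
    # p(E): total probability of seeing the evidence at all
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    # Bayes' Rule: p(H | E) = p(E | H) * p(H) / p(E)
    return likelihood * prior / evidence


# Made-up numbers: 10% of customers churn, 60% of churners called
# support last month, and 20% of non-churners called support.
p_churn_given_call = posterior(prior=0.10, likelihood=0.60, false_positive_rate=0.20)
print(round(p_churn_given_call, 2))  # 0.25
```

With these assumed numbers, a support call raises the estimated chance of churn from 10% to 25%, because callers are three times as common among churners as among everyone else.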

I would recommend this book to any DBA or Developer looking for a useful introduction to Data Science. For a practical application of the concepts, I recommend following it with Data Analysis Using SQL and Excel by Gordon Linoff. As a SQL Server DBA, I will apply the concepts I learned from the book to SQL Server Analysis Services.
