Is data science just glorified data analysis?

Use of information theory in applied data science


Today I came across the book "Information Theory: A Tutorial Introduction" by James Stone and pondered for a moment or two about the extent to which information theory is used in applied data science (if you are not familiar with this somewhat fuzzy term, think of data analysis, of which data science is IMHO a glorified version). I am well aware of the significant use of information theory based approaches, methods, and measures, especially entropy, under the hood of various statistical techniques and data analysis methods.
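(To make the "under the hood" remark concrete, here is a minimal sketch, assuming NumPy and SciPy and a made-up label sample, of the Shannon entropy of an empirical class distribution, i.e. the quantity that information-gain splitting criteria in decision trees are built on.)

    import numpy as np
    from scipy.stats import entropy

    # Made-up label sample, purely for illustration.
    labels = np.array(["spam", "spam", "ham", "ham", "ham", "other"])

    # Empirical class probabilities.
    _, counts = np.unique(labels, return_counts=True)
    probs = counts / counts.sum()

    # Shannon entropy in bits.
    print(f"Empirical entropy: {entropy(probs, base=2):.3f} bits")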

However, I am curious about the scope/level of knowledge of these concepts, measures, and tools that an applied social scientist needs in order to select and apply them successfully, without delving too deeply into the mathematical origins of the theory. I look forward to answers that address my concerns in the context of the above book (or other similar books, which you are welcome to recommend) or in general.

I would also appreciate some recommendations for print or online sources that discuss information theory and its concepts, approaches, methods, and measures in the context of (and in comparison to) other, more traditional statistical approaches (frequentist and Bayesian).






Reply:


The first part of the question: do data scientists need to know information theory? Until recently, I thought the answer was no. The reason I changed my mind is a critical component: noise.

Many machine learning models (both stochastic and non-stochastic) use noise as part of their encoding and transformation process, and in many of these models you need to infer the probability of the noise's effect after decoding the transformed output of the model. I think this is a central part of information theory. In addition, the KL divergence, a very important measure in deep learning, also comes from information theory.
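To get a feel for that KL term in practice, here is a minimal sketch, assuming NumPy and SciPy and two made-up discrete distributions, of the divergence between a "data" distribution p and a model's approximating distribution q, the same quantity that shows up in deep-learning losses such as a variational autoencoder's regularization term.

    import numpy as np
    from scipy.stats import entropy  # entropy(p, q) returns KL(p || q) when q is given

    # Made-up discrete distributions over the same four outcomes.
    p = np.array([0.4, 0.3, 0.2, 0.1])      # "data" distribution
    q = np.array([0.25, 0.25, 0.25, 0.25])  # model's approximating distribution

    # D_KL(p || q) = sum_i p_i * log(p_i / q_i), in nats by default.
    print(f"KL(p || q) = {entropy(p, q):.4f} nats")
    # KL divergence is asymmetric, so the reverse direction generally differs.
    print(f"KL(q || p) = {entropy(q, p):.4f} nats")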

Second part of the question: I think the best source is David MacKay's "Information Theory, Inference, and Learning Algorithms". He starts with information theory and carries those ideas into both inference and neural networks. The PDF is free on MacKay's website, and the lectures are available online, which is great.



