Tarastats Statistical Consultancy
https://www.tarastats.com/

A New Approach To Statistical Thinking
https://www.tarastats.com/new-approach-statistical-thinking/
Sat, 15 Oct 2022

The post A New Approach To Statistical Thinking appeared first on Tarastats Statistical Consultancy.


Statistical thinking needs a new approach in light of recent developments in modern statistical science.

This new approach puts causal thinking at the heart of the key statistical thinking concepts, reflecting recent developments in causal inference theories, methodologies and approaches.

It is based on key concepts of modern statistical science, such as an understanding of modern sampling theories and missing data mechanisms, and of their impact on the usefulness of interpretations obtained from descriptive statistics.

Furthermore, an understanding of sampling theories and missing data mechanisms is critical for performing inferential statistics in a scientifically objective way.
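The point about missing data mechanisms and descriptive statistics can be illustrated with a minimal Python sketch. The income figures and the rule that values above 80 go unrecorded are invented for illustration; they are assumptions, not data from any study. The sketch shows only one mechanism, in which missingness depends on the value itself.

```python
import statistics

# Hypothetical complete data: ten incomes (in thousands).
complete_incomes = [20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0, 110.0]

# Assumed missing data mechanism: incomes above 80 are never reported,
# so whether a value is missing depends on the value itself.
observed_incomes = [income for income in complete_incomes if income <= 80.0]

full_mean = statistics.mean(complete_incomes)      # what we would like to know
observed_mean = statistics.mean(observed_incomes)  # what the descriptive statistic reports

print(f"Mean of complete data: {full_mean}")
print(f"Mean of observed data: {observed_mean}")
```

In this toy case the observed mean understates the complete-data mean, which is why the mechanism behind the missing values matters before a descriptive statistic is interpreted.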

Because this new approach to statistical thinking is based on advances in modern statistical science, we call it modern statistical thinking.

This article has not been released yet. Be the first to know when it comes out.

Do you wonder how statistics lies?
https://www.tarastats.com/how-statistics-lies/
Fri, 14 Oct 2022

The post Do you wonder how statistics lies? appeared first on Tarastats Statistical Consultancy.


Statistics lies in the presence of ignorance.


By definition, ignorance means lack of knowledge, understanding, or information about something.

Statistics lies when a person presenting statistical data lacks knowledge, understanding, or information about the statistical-methodological techniques that enable one to analyse data in a scientifically objective way.

Although we live in an evidence-based world, our societies have not done enough to enable citizens to be statistically literate.

In earlier times, being able to read and write enabled citizens to lead more prosperous lives. Today’s data-driven world requires statistical literacy.

Does this mean that if statistics is presented by those who have studied it, we do not need to worry about statistics lying?

No. Even statisticians lack knowledge and understanding.

Why is it so?

For a long time we have been ‘living’ with the idea of a linear world. Statistics has been taught that way too, and in many places it still is.

What is new?

In the early 20th century, a theory called quantum mechanics was developed in physics. Quantum mechanics enables us to calculate the properties and the behaviour of physical systems.

With quantum mechanics we learnt that it is all about cause and effect: an action (cause) manipulates the behaviour of objects or subjects, and the effect is the impact of that action (cause/manipulation) that we observe.

Place and time play a significant role in the causal-effect field.

If we connect this information to the fact that most questions of interest are causal in their nature, we can conclude that in order for statistics not to lie, those who present it are required to understand how causal relationships are analysed in a scientifically objective way.

Why do they not have such important knowledge?

The science of the 20th century was responsible not only for the development of quantum mechanics, but also for the development of statistical methods that enable us to analyse causal relationships in a scientifically objective way. In the second half of the 20th century, Donald B. Rubin developed a causal model, broadly known as the Rubin Causal Model (Holland 1986), which opened the door for statistics to reflect reality more faithfully. This causal model is a foundation for causal effect studies in a variety of fields, from medicine and public health to economics, environment, biology, law and business.

Imagine: these important contributions to modern statistical science were made only about 35 years ago. It takes time for researchers to catch up on these developments and for the curricula of applied statistics courses to change accordingly.

How should the curriculum of applied statistics education change?

The most important thing is that the curriculum shifts from technical and theoretical details to modern statistical thinking and to the use of important methods and techniques to derive data insights in a scientifically objective way.

The next most important thing is the introduction of causality in statistics. Students should learn the basics of causality in statistics, e.g., how statistical science defines a cause, what a causal design is and how to design a causal study objectively.

The bottom line is that students need to learn about the processes required to analyse causal relationships in a scientifically objective way. To be able to do so, they also need to learn the key concepts of modern statistical thinking, that is, the thinking required to analyse data in a scientifically objective way.

Is there a course built on such an applied statistics curriculum?

Our founder Dr. Ana Kolar has been developing such curricula for the past 5 years. Some of her in-person courses can be attended at the University of Helsinki. Those interested in online learning can find online versions of the courses here.

Please keep in mind that Ana uses an experiential learning approach, which enables deep learning. The online versions of her courses are created in this spirit too. Ana believes that students should have an opportunity to deepen their knowledge as they go through the learning process because this is the only way to knowledge. Students often rate her as an excellent teacher. She is fun too!

Can I learn how to analyse causal relationships by studying books and articles?

Yes. There are plenty of books, articles and online lectures that one can use to learn about causal inference but, without proper guidance, it will take years to grasp the foundations of causal inference in full.

It is no secret that the analysis of causal relationships is one of the most complex and difficult fields. It is a field of study that requires heavy use of ‘thinking’ and a holistic approach in order to resolve complexities. Modern statistical thinking, the thinking based on the key developments of modern statistical science, matters more here than anywhere else.

If you do not understand how to analyse causal relationships using the latest tools of modern statistical science, then your data analysis could lead to severe biases.

Data science without causal inference is like a fish without water
https://www.tarastats.com/data-science-needs-causal-inference/
Sun, 15 May 2016

The post Data science without causal inference is like a fish without water appeared first on Tarastats Statistical Consultancy.


Learning about foundational concepts of causal inference is crucial for data science because most questions of interest are causal in their nature.

Whether we perform impact evaluations, A/B testing, quality control or clinical trials, causal inference is the method of choice.

Causal inference is one of the most complex data inference methods, but it offers high rewards in terms of the insights it provides. In essence, the theory and methods behind causal inference enable us to analyse causal relationships and thus fully unlock the value that data holds.

Being familiar with causal inference methods and techniques also equips us with problem-solving skills that are crucial for analysing data in a scientifically objective way.

The main problem of causal inference is missing data and how to handle it. Because incomplete data is a typical problem in data science (often associated with biased results), it is important to become familiar with causal inference methods and techniques, since such knowledge enhances our ability to deal effectively with incomplete data.
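One common way to see why causal inference is a missing data problem is the potential-outcomes view associated with the Rubin Causal Model: each unit has one outcome under treatment and one without it, but only one of the two is ever observed. The following Python sketch illustrates this under stated assumptions. The four units and all their values are hypothetical, and the names `y0`, `y1` and `observe` are chosen here for illustration only.

```python
# Hypothetical toy data: y0 is the outcome without the intervention,
# y1 the outcome with it. In real data we never see both for one unit.
units = [
    {"id": 1, "treated": True,  "y0": 5.0, "y1": 7.0},
    {"id": 2, "treated": False, "y0": 6.0, "y1": 7.0},
    {"id": 3, "treated": True,  "y0": 4.0, "y1": 6.0},
    {"id": 4, "treated": False, "y0": 7.0, "y1": 10.0},
]

def observe(unit):
    """Return what a researcher actually sees: the other outcome is missing."""
    if unit["treated"]:
        return {"id": unit["id"], "treated": True, "y0": None, "y1": unit["y1"]}
    return {"id": unit["id"], "treated": False, "y0": unit["y0"], "y1": None}

observed = [observe(unit) for unit in units]

# Exactly one potential outcome per unit is missing in the observed data.
missing_count = sum(
    1 for row in observed for key in ("y0", "y1") if row[key] is None
)

# The true average effect is computable here only because the data are invented.
true_average_effect = sum(u["y1"] - u["y0"] for u in units) / len(units)

for row in observed:
    print(row)
print("Missing potential outcomes:", missing_count)
print("True average effect (unknowable in practice):", true_average_effect)
```

Half of the potential outcomes are missing by construction, so every causal estimate is, in this sense, a statement about data we could not observe.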


Understanding incomplete data is the path to handling missing data.

For a long time, causal inference could be performed only within a randomised experimental framework. Due to recent developments in statistical-methodological science, we are now able to perform causal inference also with observational data, i.e., data that do not come from a randomised experiment.

In recent years, causal inference has become one of the most popular methods for analysing data. However, many still struggle with the complexities of the causal inference conceptual framework.

The conceptual framework of causal inference provides foundational knowledge about the required causal reasoning, as well as about the use of modern statistical thinking to design studies and analyse data in a causal effect fashion. Such foundational knowledge is critical for analysing causal relationships in a scientifically objective way.

Causal inference also provides us with an understanding of the impact that study designs have on the trustworthiness of the data insights obtained, as well as on our capacity to unlock the value that data holds.

Some examples of questions that causal inference can answer:

Not all causal questions can be answered!

For example, do black students perform better in educational attainment than white or Hispanic students?

In this example, race is considered to be the cause. However, because we cannot manipulate such a cause, meaning that there is no simple intervention with which we could transform a white person into a black person, the results of such a study cannot be called causal effects. They are associations that are conditional on the set of covariates used in the comparative analysis.

Another example of a cause that cannot be manipulated is sex. We cannot give an individual a magic pill and transform him/her into the opposite sex.

Within the randomised experimental framework, we use an intervention to manipulate the units of one of the groups being compared. For example, we apply the intervention to units of one group (usually called the treated group), while not applying it to units of another group, i.e., the control group. The intervention is the known cause in the language of causal effect studies.
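The treated-versus-control comparison can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not a recipe for analysing a real trial: the function name `run_randomised_experiment`, the baseline distribution, the effect size of 2.0 and the fair-coin assignment are all invented here.

```python
import random
import statistics

def run_randomised_experiment(n_units=10_000, true_effect=2.0, seed=0):
    """Simulate a two-group randomised experiment and return the
    difference in mean outcomes (treated minus control)."""
    rng = random.Random(seed)
    treated_outcomes, control_outcomes = [], []
    for _ in range(n_units):
        baseline = rng.gauss(10.0, 1.0)  # outcome without the intervention
        if rng.random() < 0.5:           # fair coin decides the group
            treated_outcomes.append(baseline + true_effect)
        else:
            control_outcomes.append(baseline)
    return statistics.mean(treated_outcomes) - statistics.mean(control_outcomes)

estimate = run_randomised_experiment()
print(f"Estimated effect of the intervention: {estimate:.2f}")
```

Because the coin flip does not depend on anything about the unit, the two groups are comparable on average, and the simple difference in means lands close to the effect built into the simulation.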

The known cause is the cause that can be manipulated. When we can define the known cause, we are able to use causal inference methods and techniques to perform causal effect studies also with observational data. However, when using observational data, we must make every effort to come up with two comparable groups, that is, two groups of units that are approximately identical with respect to important covariates and differ only with respect to the applied intervention, i.e., the known cause.
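One widely used way to check whether two groups are comparable on a covariate is the standardised mean difference. The sketch below is a minimal illustration of that diagnostic, not a method prescribed in this article. The age values are hypothetical, and the 0.1 threshold mentioned in the comments is a common rule of thumb rather than a fixed standard.

```python
import statistics

def standardised_mean_difference(treated_values, control_values):
    """Difference in group means divided by the pooled standard deviation."""
    mean_difference = statistics.mean(treated_values) - statistics.mean(control_values)
    pooled_sd = (
        (statistics.variance(treated_values) + statistics.variance(control_values)) / 2
    ) ** 0.5
    return mean_difference / pooled_sd

# Hypothetical covariate: age of the units in each group.
treated_age = [30.0, 35.0, 40.0, 45.0, 50.0]
comparable_control_age = [31.0, 34.0, 41.0, 44.0, 50.0]
older_control_age = [50.0, 55.0, 60.0, 65.0, 70.0]

smd_balanced = standardised_mean_difference(treated_age, comparable_control_age)
smd_unbalanced = standardised_mean_difference(treated_age, older_control_age)

# Rule of thumb (an assumption here): an absolute value below 0.1 suggests balance.
print(f"Comparable control group: {smd_balanced:.2f}")
print(f"Older control group:      {smd_unbalanced:.2f}")
```

A large absolute value signals that the groups differ on that covariate before the intervention, so a difference in outcomes could not be attributed to the known cause alone.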

Selection of covariates

Careful selection of covariates is of utmost importance for reconstructing the structure of observational data so that it mimics the data structure of a randomised experiment. Such reconstruction is a complex task, but in essence it requires that we reconstruct the assignment mechanism of the observational data to mimic the assignment mechanism of randomised experimental data.

What is an Assignment Mechanism?

In a two-group randomised experimental design, units are assigned to either Group 1 or Group 2, commonly called the treated and the control group. The mechanism that assigns units randomly is called an assignment mechanism. Because with observational data such an assignment mechanism either does not exist or is broken, it is important to reconstruct it in a way that mimics the assignment mechanism of a randomised experiment.
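To make the idea of a broken assignment mechanism concrete, here is a minimal sketch using stratification on a single covariate, one simple technique among many and not necessarily the one used in any particular study. All numbers are invented: a binary covariate `x` raises the outcome by 5 and also makes treatment far more likely, while the true effect of the intervention is fixed at 2.

```python
import statistics

TRUE_EFFECT = 2.0  # effect of the intervention built into the toy data

def make_units(x, n_treated, n_control):
    """Create hypothetical units whose outcome depends on x and on treatment."""
    units = []
    for treated, count in ((True, n_treated), (False, n_control)):
        for _ in range(count):
            outcome = 10.0 + 5.0 * x + (TRUE_EFFECT if treated else 0.0)
            units.append({"x": x, "treated": treated, "y": outcome})
    return units

# Broken assignment mechanism: units with x = 1 are treated far more often.
units = make_units(x=0, n_treated=20, n_control=80) + make_units(
    x=1, n_treated=80, n_control=20
)

def difference_in_means(rows):
    treated = [row["y"] for row in rows if row["treated"]]
    control = [row["y"] for row in rows if not row["treated"]]
    return statistics.mean(treated) - statistics.mean(control)

# Naive comparison ignores that the groups differ in x.
naive_estimate = difference_in_means(units)

# Within each level of x the groups are comparable, as in a randomised
# experiment; the stratum estimates are averaged, weighted by stratum size.
stratified_estimate = sum(
    difference_in_means([row for row in units if row["x"] == x])
    * sum(1 for row in units if row["x"] == x)
    for x in (0, 1)
) / len(units)

print("Naive estimate:     ", naive_estimate)
print("Stratified estimate:", stratified_estimate)
```

The naive comparison mixes the effect of the intervention with the effect of `x`, whereas comparing like with like within each stratum recovers the effect built into the data. Real observational data rarely have a single, cleanly measured covariate, which is where the careful selection of covariates comes in.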

The process of reconstructing the assignment mechanism can in many ways be considered a work of art. Yes, science requires art! However, this ‘art’ requires us to be very familiar with the necessary causal inference assumptions and the ways to satisfy them.

Causal Inference without assumptions is mission impossible

A set of causal assumptions must be satisfied with regard to the study design in order to obtain trustworthy conclusions from causal effect estimates. Justifying causal assumptions is difficult. It requires creative thinking, modern statistical thinking and an understanding of the science of causal thinking.

Understanding assumptions and how to justify them is of great importance when designing causal inference studies, because the effectiveness of a causal design depends on the capacity to justify the required assumptions. The more effective the study design is, the better we can justify the required assumptions. A good study design is so important that, as Donald B. Rubin put it, “Sometimes the design effort can be so extensive that a description of it, with no analyses of any outcome data, can be itself publishable” (Rubin 2008, For Objective Causal Inference, Design Trumps Analysis, The Annals of Applied Statistics).

To be able to design causal inference studies effectively, it is important to become familiar with the conceptual foundations of causal inference. Causal inference is neither an algorithm nor an equation, but a methodological and analytical approach for analysing causal relationships that requires heavy use of ‘human-mind’ software. Learn more about the foundations of causal inference here.
