Linear regression models the relationship between a scalar variable y and one or more explanatory variables denoted X. The unknown model parameters are estimated from the data using linear functions.

polyfit(x, y2, 1) % with x = [1 2 3 3 3 4 4] and y2 = [1 3 4 5 6 7 8], returns [2.1667 -1.3333], i.e. y = 2.1667x - 1.3333

The null hypothesis (denoted by H0) is a statement about the value of a population parameter (such as the mean). It must contain the condition of equality and must be written with the symbol =, ≤, or ≥.

**3. Explain the central limit theorem.**

As the sample size increases, the sampling distribution of sample means approaches a normal distribution.

If all possible random samples of size n are selected from a population with mean μ and standard deviation σ, the mean of the sample means, denoted by μ_x̄, is

μ_x̄ = μ

and the standard deviation of the sample means is

σ_x̄ = σ/√n
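These two identities can be checked empirically. The following is a minimal simulation sketch, assuming a uniform(0, 1) population (mean 0.5, standard deviation √(1/12)); the sample size, the number of samples, and the seed are arbitrary illustrative choices.

```matlab
rng(0);                                  % fixed seed so the run is repeatable
n = 36;                                  % size of each sample (illustrative)
num_samples = 10000;                     % number of samples drawn (illustrative)

% Each column is one random sample of size n from uniform(0, 1).
samples = rand(n, num_samples);
sample_means = mean(samples, 1);         % one mean per sample

mu_xbar = mean(sample_means);            % should be close to mu = 0.5
sigma_xbar = std(sample_means);          % should be close to sigma / sqrt(n)
expected_sigma_xbar = sqrt(1/12) / sqrt(n);

fprintf('mean of sample means: %.4f\n', mu_xbar);
fprintf('sd of sample means:   %.4f (theory: %.4f)\n', sigma_xbar, expected_sigma_xbar);
```

A histogram of sample_means (for example with histogram(sample_means)) should look approximately bell-shaped even though the population itself is flat.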

A hash table is a data structure used to implement an associative array, a structure that can map keys to values. A hash table uses a hash function to compute an index into an array of buckets or slots, from which the correct value can be found.
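As a small illustration, MATLAB's containers.Map provides the key-to-value interface described above. The keys, values, bucket count, and the toy_hash function below are made up for this sketch; toy_hash only illustrates how a key can be turned into a bucket index and is not how containers.Map works internally.

```matlab
% Associative array: map string keys to numeric values.
ages = containers.Map('KeyType', 'char', 'ValueType', 'double');
ages('alice') = 31;
ages('bob') = 27;

bob_age = ages('bob');                 % look up a value by key
has_carol = isKey(ages, 'carol');      % false: this key was never inserted
remove(ages, 'alice');                 % delete one key/value pair
num_entries = ages.Count;              % one entry left

% Toy hash function (hypothetical): sum of character codes modulo the
% number of buckets, shifted to a 1-based bucket index.
num_buckets = 8;
toy_hash = @(key) mod(sum(double(key)), num_buckets) + 1;
bucket_of_bob = toy_hash('bob');       % 98 + 111 + 98 = 307, so bucket 4
```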

**5. Do you know what binary search is?**

For binary search, the array must be sorted in ascending or descending order. In each step, the algorithm compares the search key value with the key value of the middle element of the array. If the keys match, then a matching element has been found and its index, or position, is returned. Otherwise, if the search key is less than the middle element's key, then the algorithm repeats its action on the sub-array to the left of the middle element or, if the search key is greater, on the sub-array to the right. If the remaining sub-array becomes empty, the key is not in the array.
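The steps above can be sketched as a short loop. This is a minimal version assuming an array sorted in ascending order; the array contents, the key, and the convention that 0 means "not found" are choices made for this example.

```matlab
a = [2 5 8 12 16 23 38 56 72 91];   % sorted ascending (illustrative data)
key = 23;                           % value to search for

lo = 1;                             % left end of the current sub-array
hi = numel(a);                      % right end of the current sub-array
idx = 0;                            % 0 means "not found" (indices start at 1)

while lo <= hi
    mid = floor((lo + hi) / 2);     % middle element of the sub-array
    if a(mid) == key
        idx = mid;                  % match: remember the position and stop
        break;
    elseif key < a(mid)
        hi = mid - 1;               % continue in the left sub-array
    else
        lo = mid + 1;               % continue in the right sub-array
    end
end

disp(idx)                           % displays 6
```

Each iteration halves the sub-array, so the loop runs at most about log2(n) + 1 times for an array of n elements.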

**6. What is the binomial probability formula?**

P(x) = n!/[(n-x)! x!] · p^x · q^(n-x)

where n = number of trials

x = number of successes among n trials

p = probability of success in any one trial

q = 1 - p (probability of failure in any one trial)
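A direct evaluation of this formula, as a minimal sketch: the values n = 5, x = 3, and p = 0.5 are chosen only for illustration, and nchoosek(n, x) computes n!/[(n-x)! x!].

```matlab
n = 5;        % number of trials (illustrative)
x = 3;        % number of successes
p = 0.5;      % probability of success in one trial
q = 1 - p;    % probability of failure in one trial

% Binomial probability of exactly x successes in n trials.
P_x = nchoosek(n, x) * p^x * q^(n - x);   % 10 * 0.125 * 0.25 = 0.3125

% Sanity check: the probabilities over x = 0..n add up to 1.
total = 0;
for k = 0:n
    total = total + nchoosek(n, k) * p^k * q^(n - k);
end
```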

**7. Give an example of the Central Limit Theorem.**

Given that the population of men has normally distributed weights, with a mean of 173 lb and a standard deviation of 30 lb, find the probability that

a. if 1 man is randomly selected, his weight is greater than 180 lb.

b. if 36 different men are randomly selected, their mean weight is greater than 180 lb.

Solution:

a) z = (x - μ)/σ = (180 - 173)/30 = 0.23

For the standard normal distribution, P(Z > 0.23) = 0.4090

b) σ_x̄ = σ/√n = 30/√36 = 5

z = (180 - 173)/5 = 1.40

P(Z > 1.40) = 0.0808
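The same calculation can be reproduced numerically. This sketch uses only base MATLAB: the upper tail of the standard normal distribution is written with erfc, as P(Z > z) = 0.5 · erfc(z/√2). The variable names are choices made for this example.

```matlab
mu = 173;        % population mean weight (lb)
sigma = 30;      % population standard deviation (lb)
x = 180;         % threshold weight (lb)
n = 36;          % sample size in part b

% Upper-tail probability of the standard normal distribution.
upper_tail = @(z) 0.5 * erfc(z / sqrt(2));

% a) one man: use the population standard deviation.
z_a = (x - mu) / sigma;
p_a = upper_tail(z_a);

% b) mean of 36 men: use the standard deviation of the sample mean.
sigma_xbar = sigma / sqrt(n);     % 30 / 6 = 5
z_b = (x - mu) / sigma_xbar;      % 7 / 5 = 1.4
p_b = upper_tail(z_b);
```

Because z_a is not rounded to 0.23 here, p_a comes out near 0.408 rather than the table value 0.4090; p_b agrees with 0.0808 to four decimal places.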

**8. What is the significance level?**

The probability of rejecting the null hypothesis when it is true is called the significance level α. Very common choices are α = 0.05 and α = 0.01.

**9. What is the alternative hypothesis?**

The alternative hypothesis (denoted by H1) is the statement that must be true if the null hypothesis is false.

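**10. What is a one-sample t-test?**

A t-test is any statistical hypothesis test in which the test statistic follows a Student's t distribution if the null hypothesis is true. A one-sample t-test compares the mean of a single sample against a hypothesized value.

[h, p, ci] = ttest(y2, 0) % tests mean(y2) = 0 for y2 = [1 3 4 5 6 7 8]; returns h = 1, p = 0.0018, ci = [2.6280 7.0863]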
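To show what ttest computes, the statistic and confidence interval can be rebuilt by hand in base MATLAB. This is a sketch under one assumption: the critical value 2.4469 (two-sided, 5% level, 6 degrees of freedom) is taken from a t table instead of being computed.

```matlab
y2 = [1 3 4 5 6 7 8];
mu0 = 0;                          % hypothesized mean under H0

n = numel(y2);
se = std(y2) / sqrt(n);           % standard error of the sample mean
t_stat = (mean(y2) - mu0) / se;   % one-sample t statistic, n - 1 = 6 d.o.f.

t_crit = 2.4469;                  % assumed table value for alpha = 0.05, 6 d.o.f.
ci = mean(y2) + [-1 1] * t_crit * se;   % 95% confidence interval for the mean
h = abs(t_stat) > t_crit;         % true: reject H0 at the 5% level
```

The interval reproduces the ci reported by ttest above, and h is true because the t statistic (about 5.33) exceeds the critical value.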

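Covariance is a measure of how much two variables change together.

x = [1 2 3 3 3 4 4]

y2 = [1 3 4 5 6 7 8]

cov(x, y2) % returns a 2×2 matrix; the diagonal holds the variances of x and y2, the off-diagonal entries the covariance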

A moment is a quantitative measure of the shape of a set of points.

moment(x, 2) % returns the second central moment

Kurtosis is a measure of how outlier-prone a distribution is.

kurtosis(x) % returns 2.3594

Variance describes how far values lie from the mean.

var(x) % returns 1.1429

Skewness is a measure of the asymmetry of the data around the sample mean. If skewness is negative, the data are spread out more to the left of the mean than to the right. If skewness is positive, the data are spread out more to the right.

skewness(x) % returns -0.5954

► first quartile (25th percentile)

► second quartile (50th percentile), i.e. the median

► third quartile (75th percentile)

► kth percentile

prctile(x, 25) % 25th percentile, returns 2.25

prctile(x, 50) % 50th percentile, returns 3, i.e. the median

The median is the numeric value separating the higher half of a sample, a population, or a probability distribution from the lower half. The median of a finite list of numbers can be found by arranging all the observations from lowest value to highest value and picking the middle one; for an even number of observations, the mean of the two middle values is used.

median(x) % returns 3

The mode of a data sample is the element that occurs most often in the collection.

x=[1 2 3 3 3 4 4]

mode(x) % returns 3, the value that occurs most often

**19. What are sampling methods?**

There are four sampling methods:

► Simple random (purely random)

► Systematic (every kth member of the population)

► Cluster (population divided into groups or clusters, and whole clusters are selected at random)

► Stratified (population divided into exclusive groups or strata, with a sample drawn from each group)
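The four methods can be sketched on a toy population. Everything here is an assumption made for illustration: the population is simply the numbers 1 to 100, strata and clusters are formed by cutting that range into equal blocks, and the sample sizes and seed are arbitrary.

```matlab
rng(1);                                   % fixed seed so the sketch is repeatable
N = 100;
population = 1:N;                         % hypothetical population members
n = 10;                                   % desired sample size

% Simple random: n members chosen uniformly without replacement.
simple_idx = randperm(N, n);

% Systematic: random start, then every k-th member.
k = N / n;
start = randi(k);
systematic_idx = start:k:N;

% Stratified: 4 exclusive strata of 25 members (columns), 2 drawn from each.
strata = reshape(population, 25, 4);
stratified_idx = zeros(1, 8);
for s = 1:4
    pick = randperm(25, 2);
    stratified_idx(2*s-1:2*s) = strata(pick, s).';
end

% Cluster: 10 clusters of 10 members (columns), 2 whole clusters selected.
clusters = reshape(population, 10, 10);
chosen = randperm(10, 2);
cluster_idx = reshape(clusters(:, chosen), 1, []);
```

Note the difference between the last two: stratified sampling takes a few members from every group, while cluster sampling takes every member from a few groups.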

Sampling is that part of statistical practice concerned with the selection of an unbiased or random subset of individual observations within a population of individuals intended to yield some knowledge about the population of concern.

**21. Give an example of a p-value.**

Suppose that the experimental results show the coin turning up heads 14 times out of 20 total flips:

► null hypothesis (H0): fair coin;

► observation O: 14 heads out of 20 flips; and

► p-value of observation O given H0 = Prob(≥ 14 heads or ≥ 14 tails) = 0.115.

The calculated p-value exceeds 0.05, so the observation is consistent with the null hypothesis - that the observed result of 14 heads out of 20 flips can be ascribed to chance alone - as it falls within the range of what would happen 95% of the time were this in fact the case. In our example, we fail to reject the null hypothesis at the 5% level. Although the coin did not fall evenly, the deviation from expected outcome is small enough to be reported as being "not statistically significant at the 5% level".
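The value 0.115 can be reproduced by summing binomial probabilities. This is a minimal sketch of the two-sided calculation for the coin example, using base MATLAB only; the variable names are choices made here.

```matlab
n = 20;            % total flips
k = 14;            % observed number of heads
p_fair = 0.5;      % probability of heads under H0 (fair coin)

% P(at least 14 heads) under H0: sum the binomial probabilities for 14..20.
upper_tail = 0;
for i = k:n
    upper_tail = upper_tail + nchoosek(n, i) * p_fair^i * (1 - p_fair)^(n - i);
end

% By symmetry of the fair coin, P(at least 14 tails) is the same.
p_value = 2 * upper_tail;

reject_at_5_percent = p_value < 0.05;   % false: fail to reject H0
```

The one-sided tail is about 0.058, so doubling it gives the two-sided p-value of about 0.115 quoted above.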

In statistical significance testing, the p-value is the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true. The null hypothesis is usually rejected when the p-value is less than 0.05 or 0.01, corresponding respectively to a 5% or 1% chance of rejecting the null hypothesis when it is true.

The probability of some observed outcomes given a set of parameter values is regarded as the likelihood of the set of parameter values given the observed outcomes.
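As an illustration, take a hypothetical observation of 14 heads in 20 coin flips and treat the probability of heads as the unknown parameter. The same binomial expression is then read as a function of the parameter with the data held fixed. The grid of candidate values is an arbitrary choice for this sketch.

```matlab
n = 20;                 % flips (hypothetical data)
heads = 14;             % observed heads (hypothetical data)

% Likelihood of parameter value p given the observed outcome.
likelihood = @(p) nchoosek(n, heads) .* p.^heads .* (1 - p).^(n - heads);

L_fair = likelihood(0.5);       % likelihood of a fair coin
L_biased = likelihood(0.7);     % likelihood of a coin with P(heads) = 0.7

% Evaluate on a grid and find the most likely parameter value.
p_grid = 0:0.01:1;
[~, best] = max(likelihood(p_grid));
p_hat = p_grid(best);           % 0.70, which equals heads / n
```

Here p = 0.7 has a higher likelihood than p = 0.5, and the grid maximum sits at the observed proportion 14/20.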

Frequentists condition on a hypothesis of choice and consider the probability distribution on the data, whether observed or not.

Bayesians condition on the data actually observed and consider the probability distribution on the hypotheses.

**26. When you are creating a statistical model how do you prevent over-fitting?**

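Use cross-validation: hold out part of the data, fit the model on the rest, and measure the prediction error on the held-out part. A model that fits the training data much better than the held-out data is over-fitting.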
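A minimal leave-one-out cross-validation sketch, assuming the small data set x = [1 2 3 3 3 4 4], y2 = [1 3 4 5 6 7 8] and polynomial models of degree 1 and 2 as the candidates. Both the data and the candidate degrees are illustrative choices.

```matlab
x  = [1 2 3 3 3 4 4];
y2 = [1 3 4 5 6 7 8];
n = numel(x);

degrees = [1 2];                      % candidate model complexities
cv_mse = zeros(size(degrees));        % cross-validated error per candidate

for d = 1:numel(degrees)
    sq_err = zeros(1, n);
    for i = 1:n
        train = true(1, n);
        train(i) = false;             % hold out observation i
        coeffs = polyfit(x(train), y2(train), degrees(d));
        prediction = polyval(coeffs, x(i));
        sq_err(i) = (prediction - y2(i))^2;   % error on the held-out point
    end
    cv_mse(d) = mean(sq_err);
end

[~, best] = min(cv_mse);
best_degree = degrees(best);          % degree with the lowest held-out error
```

The model is chosen by its error on data it was not fitted to, so extra flexibility is only rewarded when it improves predictions on unseen points.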
