Questions on the use of mathematical techniques to extract properties from given data. Consider whether your question might be better suited to Cross Validated (stats.SE) instead.
Questions tagged [data-analysis]
988 questions
29
votes
5 answers
Where to start learning about topological data analysis?
I was wondering if anyone could help me out with finding a nice introductory text for topological data analysis (I'm speaking as somebody who has two semesters of experience with topology, and much less experience with data analysis.) Are there any…
Bachmaninoff
20
votes
2 answers
Roadmap for learning Topological Data Analysis?
I'm a math major who has recently graduated and I will be starting full time work in 'data analysis'.
Having finished with decent marks and still being incredibly interested in mathematics, I was thinking of pursuing graduate study/research at some…
aevn
17
votes
5 answers
Polynomial fitting where polynomial must be monotonically increasing
Given a set of monotonically increasing data points (in 2D), I want to fit a polynomial to the data which is monotonically increasing over the domain of the data. If the highest x value is 100, I don't care what the slope of the polynomial is at…
splicer
12
votes
4 answers
Mathematics base for data mining and artificial intelligence algorithms.
Could you give me some clarification about data mining and artificial intelligence algorithms? What mathematical foundations do they rely on? Could you give me a starting point, in mathematics, for understanding these types of algorithms?
Dmitry Zagorulkin
10
votes
3 answers
Use a set of data points from a graph to find a derivative
I have a data logger that is recording the temperature readings from thermocouples at a specific interval. This gives me data points that I can graph where the x-coordinate is time and the y-coordinate is temperature. For each set of data points…
Trillian522
9
votes
3 answers
Removing noise when the signal is not smooth
Suppose we have (an interval of) a time series of measurements:
We assume it can be explained as a "simple" underlying signal overlaid by noise. I'm interested in finding a good algorithm to estimate the value of the simple signal at a given point…
hmakholm left over Monica
8
votes
2 answers
How do I find the formula (or rules) that created a list of numbers with seemingly no pattern?
Newbie here, and I apologize if this is the wrong forum for this type of question...
I have a group of 200 or so alphanumeric codes from an unknown source. Here's an example piece of the data…
Ethan Allen
7
votes
2 answers
Value range of normalization methods? min-max, z-score, decimal scaling
I am working my way through Normalization (data transformation) of data and was curious about four methods:
1. min-max normalization, 2. z-score, 3. z-score mean absolute deviation, and 4. decimal scaling.
I am reading through a book so this is…
Kairan
7
votes
0 answers
TDA and knot theory
I'm new to topological data analysis, and I have learned some of its basics, including persistent homology and mapper. In this paper, the authors suggest a method to detect the circle $S^{1}$, which is a 1-dimensional object but can't be embedded into $\mathbb{R}$.…
Seewoo Lee
7
votes
2 answers
Monitoring a data stream
You are monitoring a data stream which is delivering very many $32$-bit quantities at a rate of $10$ megabytes per second. You know that either:
$A$: All values occur equally often, or
$B$: Half of the values occur $2^{10}$ times more often than…
Hogg
7
votes
5 answers
Finding the orientation of a noisy ellipse
This question comes from a neuroscience study which generates $12$ vectors. The vectors are evenly spaced, $30 n$ degrees for $n=0,\dots, 11$, each with their tail centered on the origin.
I am looking for biases in each set of vectors, in which one…
Ryan
6
votes
2 answers
What is the most scientific way to assign weights to historical data?
This is a question I often face while processing historical data. I have year-on-year data of an event for the past N years. I would like to assign weights to the data of these N years so that the data corresponding to the most recent year…
Nilotpal Sinha
6
votes
3 answers
Find the median given a table of relative frequencies
I came across the following GRE question. I had no problem finding the mean. However, the answer for the median is given to be 1. I don't understand how they arrive at this.
Find the mean and median of the values of the random variable $X$, whose…
Joebevo
6
votes
0 answers
Explicit definition of a sequence
Suppose there are 6 sequences $a=(a_n)_{n\geq 0}, b=(b_n)_{n\geq 0},c=(c_n)_{n\geq 0},d=(d_n)_{n\geq 0},e=(e_n)_{n\geq 0},f=(f_n)_{n\geq 0}$, the data can be seen here: Data. I found out by trial and error that $a$ is defined…
PaulH
6
votes
0 answers
Relationship between eigenvectors of correlation and covariance matrices
To compute the principal components of a dataset, represented as a matrix $X$ of dimensions $n \times p$ with $n$ samples and $p$ features, we can compute the sample covariance matrix $S$ and its eigenvalue decomposition:
$$
S =…
Sasha