kdcount.correlate module
Correlation function (pair counting) with KDTree.
Pair counting is the basic algorithm for calculating correlation functions. The correlation function is a commonly used metric in cosmology for measuring the clustering of matter, or the growth of large-scale structure in the universe.
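To make the idea concrete, here is a minimal brute-force sketch of pair counting in radial bins. It is a reference illustration only, not the KD-tree algorithm this module uses. The function name `brute_force_paircount` is hypothetical, and counting each unordered pair once is an assumption that may differ from the library's convention.

```python
import bisect
import math
from itertools import combinations


def brute_force_paircount(points, edges):
    """Count unordered pairs per separation bin with an O(N^2) loop.

    Hypothetical helper for illustration; not part of kdcount.
    `edges` must be sorted; bin k covers edges[k] <= r < edges[k + 1].
    """
    counts = [0] * (len(edges) - 1)
    for p, q in combinations(points, 2):
        r = math.dist(p, q)  # Euclidean separation of the pair
        # bisect_right returns the index of the first edge greater than r
        k = bisect.bisect_right(edges, r) - 1
        if 0 <= k < len(counts):  # drop pairs outside the binned range
            counts[k] += 1
    return counts


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 0.0)]
edges = [0.0, 1.5, 2.5, 3.5]
counts = brute_force_paircount(points, edges)
```

A tree-based counter should reproduce these counts while skipping most of the distance evaluations, which is what makes it practical for large catalogs.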
We implement paircount for pair counting. Since this is a discrete estimator, the binning is modeled by subclasses of Binning. For example:
- RBinning
- RmuBinning
- XYBinning
- FlatSkyBinning
- FlatSkyMultipoleBinning
kdcount takes two types of input data: ‘point’ and ‘field’.

kdcount.models.points describes data with position and weight. For example, galaxies and quasars are point data. point.pos is a row array of the positions of the points; other fields are used internally. point.extra holds extra properties that can be used in the Binning. One use is to exclude Lyman-alpha pixels and quasars from the same sightline.

kdcount.models.field describes a continuous field sampled at given positions, each sample with a weight. A well-known example is the over-flux field in the Lyman-alpha forest, which is a proxy of the over-density field sampled along quasar sightlines.
In the Python interface, to count pairs one first defines the ‘binning’ scheme by subclassing Binning. Binning describes a multi-dimensional binning scheme. The dimensions can be derived quantities: for example, the norm of the spatial separation can be a dimension in the same way as the ‘x’ separation. See RmuBinning for an example.
class kdcount.correlate.Binning(dims, edges, compute_mean_coords=False)
Bases: object
Binning of the correlation function. Pairs whose separation falls within a bin are counted towards that bin.
Attributes
- dims (array_like): internal; descriptors of binning dimensions.
- edges (array_like): edges of bins per dimension.
- centers (array_like): centers of bins per dimension; currently the midpoint of the edges.
- compute_mean_coords (bool, optional): if True, compute and store the mean coordinate values in the __call__ function. Default is False.
digitize(r, i, j, data1, data2)
Calculate the bin number of pairs separated by distances r. Use linear() to convert from a multi-dimensional bin index to a linear index.
Parameters:
r : array_like
    separation
i, j : array_like
    index (i, j) of pairs.
data1, data2 :
    The position of the first point is data1.pos[i]; the position of the second point is data2.pos[j].
linear(**tobin)
Linearize bin indices.
This function is called by subclasses. Refer to the source code of RBinning for an example.
Parameters:
args : list
    a list of bin indices, (xi, yi, zi, ..)
Returns: linearized bin index
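As an illustration of what linearizing a multi-dimensional bin index means, the sketch below flattens an index tuple in row-major order and inverts the mapping. The helpers `linearize` and `unlinearize` are hypothetical and are not the method documented above; the actual implementation may order dimensions differently or reserve extra bins for out-of-range pairs.

```python
def linearize(indices, shape):
    """Flatten a multi-dimensional bin index in row-major (C) order.

    Hypothetical helper; `shape` is the number of bins per dimension.
    """
    flat = 0
    for idx, n in zip(indices, shape):
        if not 0 <= idx < n:
            raise ValueError("bin index out of range")
        flat = flat * n + idx  # shift previous dimensions, add this one
    return flat


def unlinearize(flat, shape):
    """Inverse of linearize: recover the per-dimension bin indices."""
    out = []
    for n in reversed(shape):
        flat, idx = divmod(flat, n)  # peel off the fastest-varying axis
        out.append(idx)
    return tuple(reversed(out))


shape = (4, 3)                      # e.g. 4 bins in r, 3 bins in mu
flat = linearize((2, 1), shape)     # 2 * 3 + 1
pair = unlinearize(flat, shape)
```

A single flat index lets per-bin sums be accumulated in one 1-D array and reshaped to the bin grid afterwards.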
class kdcount.correlate.FlatSkyBinning(rbins, Nmu, los, **kwargs)
Bases: kdcount.correlate.Binning
Binning in R and mu, in the flat sky approximation, such that all pairs have the same line-of-sight, which is taken to be the axis specified by the los parameter (default is the last dimension).
Parameters:
rmax : float
    the maximum radius to measure to
Nr : int
    the number of bins in r direction.
Nmu : int
    the number of bins in mu direction.
los : int, {0, 1, 2}
    the axis to treat as the line-of-sight
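The flat sky coordinates can be sketched as follows: with a fixed line-of-sight axis, mu is the cosine of the angle between the pair separation and that axis. The function `flat_sky_r_mu` is a hypothetical helper, and taking the absolute value of mu (folding pairs onto 0 <= mu <= 1) is an assumption rather than something stated in this documentation.

```python
import math


def flat_sky_r_mu(p, q, los=-1):
    """Return (r, mu) for a pair, with a fixed Cartesian line-of-sight axis.

    Hypothetical helper; `los` is the index of the line-of-sight axis.
    """
    d = [b - a for a, b in zip(p, q)]        # separation vector
    r = math.sqrt(sum(c * c for c in d))     # its norm
    if r == 0.0:
        return 0.0, 0.0                      # mu is undefined; use 0 by convention
    mu = abs(d[los]) / r                     # |cos(theta)| w.r.t. the los axis
    return r, mu


r, mu = flat_sky_r_mu((0.0, 0.0, 0.0), (3.0, 0.0, 4.0), los=2)
r_x, mu_x = flat_sky_r_mu((0.0, 0.0, 0.0), (3.0, 0.0, 4.0), los=0)
```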
class kdcount.correlate.FlatSkyMultipoleBinning(rbins, ells, los, **kwargs)
Bases: kdcount.correlate.Binning
Binning in R and ell, the multipole number, in the flat sky approximation, such that all pairs have the same line-of-sight, which is taken to be the axis specified by the los parameter (default is the last dimension).
Parameters:
rmax : float
    the maximum radius to measure to
Nr : int
    the number of bins in r direction.
ells : list of int
    the multipole numbers to compute
los : int, {0, 1, 2}
    the axis to treat as the line-of-sight
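Multipoles of a correlation function are conventionally defined through Legendre polynomials of mu. The sketch below evaluates P_ell(mu) with the standard Bonnet recurrence. How this class weights or normalizes each pair (for example with a factor of 2 ell + 1) is not specified here, so the helper `legendre` is only a hypothetical building block, not the class's implementation.

```python
def legendre(ell, mu):
    """Evaluate the Legendre polynomial P_ell(mu) by Bonnet's recurrence.

    (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)
    Hypothetical helper for illustration; not part of kdcount.
    """
    if ell < 0:
        raise ValueError("ell must be non-negative")
    if ell == 0:
        return 1.0
    p_prev, p = 1.0, mu                      # P_0 and P_1
    for n in range(1, ell):
        p_prev, p = p, ((2 * n + 1) * mu * p - n * p_prev) / (n + 1)
    return p


ells = [0, 2, 4]
values = [legendre(ell, 0.5) for ell in ells]
```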
class kdcount.correlate.RBinning(rbins, **kwargs)
Bases: kdcount.correlate.Binning
Binning along radial direction.
Parameters:
Rmax : float
    max radius to go to
Nbins : int
    number of bins in each direction.
class kdcount.correlate.RmuBinning(rbins, Nmu, observer, **kwargs)
Bases: kdcount.correlate.Binning
Binning in R and mu (the angular coordinate along the line of sight), where mu = cos(theta) is measured relative to the line of sight from a given observer.
Parameters:
Rmax : float
    max radius to go to
Nbins : int
    number of bins in R direction.
Nmubins : int
    number of bins in mu direction.
observer : array_like (Ndim)
    location of the observer (for line of sight)
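For an observer at a finite position, the line of sight varies from pair to pair. The sketch below assumes the line of sight is the vector from the observer to the midpoint of the pair; that choice, and the helper name `r_mu_from_observer`, are assumptions for illustration and may not match the definition used by this class.

```python
import math


def r_mu_from_observer(p, q, observer):
    """Return (r, mu) with mu = cos(theta) relative to a per-pair line of sight.

    Hypothetical helper. The line of sight is ASSUMED to point from the
    observer to the pair midpoint.
    """
    d = [b - a for a, b in zip(p, q)]                            # separation
    los = [0.5 * (a + b) - o for a, b, o in zip(p, q, observer)]  # observer -> midpoint
    r = math.sqrt(sum(c * c for c in d))
    los_norm = math.sqrt(sum(c * c for c in los))
    if r == 0.0 or los_norm == 0.0:
        return r, 0.0                                            # mu undefined; use 0
    mu = sum(a * b for a, b in zip(d, los)) / (r * los_norm)
    return r, mu


observer = (0.0, 0.0, 0.0)
r_par, mu_par = r_mu_from_observer((0.0, 0.0, 9.0), (0.0, 0.0, 11.0), observer)
r_perp, mu_perp = r_mu_from_observer((-1.0, 0.0, 10.0), (1.0, 0.0, 10.0), observer)
```

A pair aligned with the line of sight gives mu close to 1, and a pair transverse to it gives mu close to 0.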
class kdcount.correlate.XYBinning(Rmax, Nbins, observer, **kwargs)
Bases: kdcount.correlate.Binning
Binning along sky and line-of-sight directions.
The bins are (sky, los).
Parameters:
Rmax : float
    max radius to go to
Nbins : int
    number of bins in each direction.
observer : array_like (Ndim)
    location of the observer (for line of sight)
Notes
- with matplotlib's imshow, the second axis, los, will be vertical
- with imshow(..T), the sky will be vertical.
kdcount.correlate.compute_sum_values(i, j, data1, data2)
Return the sum1_ij and sum2_ij values given the input indices and data instances.
Parameters:
i, j : array_like
    the bin indices for these pairs
data1, data2 : points, field instances
    the two points or field objects
Returns:
sum1_ij, sum2_ij : float, array_like (N,...)
    contributions to sum1, sum2 – either a float or array of shape (N, ...) where N is the length of i, j
Notes
This is called in Binning.__call__ to compute the sum1 and sum2 contributions for indices (i, j).
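The per-pair contributions can be sketched from the sum1 and sum2 conventions given in the paircount notes. This is one reading of those conventions, not the function's actual code: the helper `pair_sums` and its arguments are hypothetical, and a dataset is treated as a field whenever a field value is supplied for it.

```python
def pair_sums(w1, w2, value1=None, value2=None):
    """Return (sum1_ij, sum2_ij) for a single pair.

    Hypothetical helper. `w1`, `w2` are the weights; `value1`, `value2`
    are field values, or None when that dataset is a point set.
    """
    ww = w1 * w2                      # product of the two weights
    if value1 is None and value2 is None:
        return ww, 1.0                # points x points: sum2 is always 1.0
    s1 = ww
    if value1 is not None:
        s1 *= value1                  # one field value per field dataset
    if value2 is not None:
        s1 *= value2
    return s1, ww                     # field involved: sum2 is the weight product


pp = pair_sums(2.0, 3.0)
fp = pair_sums(2.0, 3.0, value1=0.5)
ff = pair_sums(2.0, 3.0, value1=0.5, value2=4.0)
```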
class kdcount.correlate.paircount(data1, data2, binning, usefast=True, np=None)
Bases: object
Pair counting via a KD-tree, on two data sets.
Notes
The values of sum1 and sum2 depend on the types of input:
- For kdcount.models.points and kdcount.models.points:
  - sum1 is the per bin sum of products of weights
  - sum2 is always 1.0
- For kdcount.models.field and kdcount.models.points:
  - sum1 is the per bin sum of products of weights and the field value
  - sum2 is the per bin sum of products of weights
- For kdcount.models.field and kdcount.models.field:
  - sum1 is the per bin sum of products of weights and the field value (one value per field)
  - sum2 is the per bin sum of products of weights
With this convention the usual form of the Landy-Szalay estimator (for points x points) is:
(DD.sum1 - 2 r DR.sum1 + r^2 RR.sum1) / (r^2 RR.sum1)
with r = sum(wD) / sum(wR)
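The estimator above can be evaluated bin by bin once the three pair counts are available. The sketch below works on plain lists of per-bin sum1 values; the function `landy_szalay` and its argument names are hypothetical, and returning NaN for empty RR bins is an illustrative choice.

```python
def landy_szalay(dd, dr, rr, sum_wd, sum_wr):
    """Per-bin Landy-Szalay estimate from sum1 of the DD, DR and RR counts.

    Hypothetical helper. `sum_wd` and `sum_wr` are the total weights of
    the data and random catalogs, so r = sum(wD) / sum(wR).
    """
    r = sum_wd / sum_wr
    xi = []
    for d, x, n in zip(dd, dr, rr):
        denom = r * r * n                         # r^2 RR.sum1
        if denom == 0:
            xi.append(float("nan"))               # no random pairs in this bin
        else:
            xi.append((d - 2.0 * r * x + denom) / denom)
    return xi


xi = landy_szalay(dd=[30.0, 20.0, 5.0], dr=[40.0, 40.0, 1.0],
                  rr=[80.0, 80.0, 0.0], sum_wd=100.0, sum_wr=200.0)
```

In the first bin, r = 0.5 gives (30 - 40 + 20) / 20 = 0.5; the second bin is consistent with no clustering.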
Attributes
- sum1 (array_like): the numerator in the correlator
- sum2 (array_like): the denominator in the correlator
- centers (list): the centers of the corresponding corr bin, one item per binning direction.
- edges (list): the edges of the corresponding corr bin, one item per binning direction.
- binning (Binning): binning object of this paircount
- data1 (dataset): input data set 1. It can be either field for discrete sampling of a continuous field, or kdcount.models.points for a point set.
- data2 (dataset): input data set 2, see above.
- np (int): number of parallel processes. Set to 0 to disable parallelism.
- For