Data.remove_duplicates

Data.remove_duplicates(xcol=None, delta=1e-08, strategy='keep first', ycol=None, yerr=None)

Find and remove rows with duplicated values of the search column(s).

Parameters:

datafile (Data) – Data object to work with if not being used as a bound method.

Keyword Arguments:
  • xcol (index types) – The column(s) to search for duplicates in.

  • delta (float or array) – The absolute difference(s) within which floats are considered equal when comparing.

  • strategy (str, default keep first) –

    What to do with duplicated rows. Options are:
    • keep first - the first row is kept, others are discarded

    • average - the duplicate rows are averaged together.

  • ycol (index) – When using the average strategy, identifies the columns that represent values.

  • yerr (index types) – When using the average strategy, identifies the columns of uncertainties, for which a properly weighted standard error should be calculated.
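The difference between the two strategies can be sketched in plain Python. This is an illustration only, not the library's implementation: combine_duplicates is a hypothetical helper, and the inverse-variance weighting shown is one common reading of "properly weighted standard error" that the method itself may or may not use.

```python
import math

def combine_duplicates(ys, yerrs=None, strategy="keep first"):
    """Collapse the y values of one group of duplicate rows (hypothetical helper)."""
    if strategy == "keep first":
        # Keep the first row's value (and uncertainty, if given); drop the rest.
        return ys[0], (yerrs[0] if yerrs is not None else None)
    if strategy == "average":
        if yerrs is None:
            # No uncertainties supplied: plain arithmetic mean.
            return sum(ys) / len(ys), None
        # Assumed weighting: inverse variance, w_i = 1 / err_i**2.
        weights = [1.0 / (e * e) for e in yerrs]
        total = sum(weights)
        mean = sum(w * y for w, y in zip(weights, ys)) / total
        # Standard error of an inverse-variance weighted mean.
        return mean, math.sqrt(1.0 / total)
    raise ValueError(f"Unknown strategy: {strategy!r}")

# Two duplicate rows with y = 10.0 +/- 1.0 and y = 12.0 +/- 1.0
ys, yerrs = [10.0, 12.0], [1.0, 1.0]
first = combine_duplicates(ys, yerrs, strategy="keep first")
averaged = combine_duplicates(ys, yerrs, strategy="average")
```

Under these assumptions, keep first returns the first row unchanged, while average returns a mean of 11.0 with an uncertainty smaller than either input error.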

Returns:

(dictionary of value: [list of row indices]):

The unique value and the associated rows that go with it.
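The shape of the return value, and the role of delta, can be shown with a minimal stand-alone sketch. The function find_duplicates below is hypothetical and is not part of the library; it assumes a single search column and treats two values as equal when their absolute difference is at most delta, which may differ in detail from the actual comparison.

```python
def find_duplicates(xvalues, delta=1e-8):
    """Map each unique value to the indices of the rows within delta of it."""
    groups = {}
    for row, x in enumerate(xvalues):
        for key in groups:
            if abs(x - key) <= delta:
                # Close enough to an existing unique value: same group.
                groups[key].append(row)
                break
        else:
            # No existing value within delta: start a new group.
            groups[x] = [row]
    return groups

x = [1.0, 2.0, 1.0 + 1e-9, 3.0, 2.0]
groups = find_duplicates(x)
# groups == {1.0: [0, 2], 2.0: [1, 4], 3.0: [3]}
```

Each key is the first occurrence of a unique value, and each list holds every row that matched it, so a list with more than one entry marks a set of duplicates.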

Notes

If ycol is not specified, then the Data.setas attribute is used. If this is also not set, then all columns are considered.