path_finding

path_finding_utils

class decitala.path_finding.path_finding_utils.CostFunction(**kwargs)[source]

Bases: object

Arbitrary cost function to use in the cost functions. The user should set weights as class attributes that are referenced in the cost function (which the user must override), if needed. If set, the weights should (probably) sum to 1.

The following is a cost function that doesn’t rely on any weights.

>>> class MyFirstCostFunction(CostFunction):
...     def cost(self, vertex_a, vertex_b):
...             '''Cost function determined by the sum of the two extractions standard deviations.'''
...             return vertex_a.std() + vertex_b.std()

The following is a cost function that relies on two weights.

>>> class MySecondCostFunction(CostFunction):
...     weight_a = 0.4213
...     weight_b = 0.2599
...     def cost(self, vertex_a, vertex_b):
...             '''Cost function determined by the sum of the two extractions standard deviations.'''
...             first_term = ((weight_a * vertex_a.num_onsets) + weight_b)
...             second_term = ((weight_a * vertex_b.num_onsets) + weight_b)
...             return first_term + second_term
cost(vertex_a, vertex_b)[source]

This function must be overrided by child classes. Cost function between two decitala.search.Extraction objects.

Parameters
Returns

The cost of moving from vertex_1 to vertex_2.

Return type

float

class decitala.path_finding.path_finding_utils.CostFunction2D(gap_weight=0.75, onset_weight=0.25)[source]

Bases: decitala.path_finding.path_finding_utils.CostFunction

Default cost function used in the path-finding algorithms. Weights optimized by hyperparameter search.

cost(vertex_a, vertex_b)[source]

This function must be overrided by child classes. Cost function between two decitala.search.Extraction objects.

Parameters
Returns

The cost of moving from vertex_1 to vertex_2.

Return type

float

class decitala.path_finding.path_finding_utils.CostFunction3D(gap_weight=0.8, onset_weight=0.1, articulation_weight=0.1)[source]

Bases: decitala.path_finding.path_finding_utils.CostFunction

cost(vertex_a, vertex_b)[source]

This function must be overrided by child classes. Cost function between two decitala.search.Extraction objects.

Parameters
Returns

The cost of moving from vertex_1 to vertex_2.

Return type

float

decitala.path_finding.path_finding_utils.build_graph(data, cost_function_class=<decitala.path_finding.path_finding_utils.CostFunction3D object>, verbose=False)[source]

Function for building a “graph” of nodes and edges from a given set of data (each vertex of the form as those required in the cost function) extracted from one of the search algorithms. Requires id keys in each dictionary input.

Parameters
Returns

A “graph” holding vertices and the associated cost between all other non-negative edges.

Return type

dict

decitala.path_finding.path_finding_utils.sources_and_sinks(data, enforce_earliest_start=False)[source]

Calculates all sources and sinks in a given dataset.

Parameters
  • data (list) – a list of decitala.search.Extraction objects.

  • enforce_earliest_start (bool) – whether to require that all sources begin at the earliest detected onset.

decitala.path_finding.path_finding_utils.best_source_and_sink(data, enforce_earliest_start=False)[source]

TODO: this is bad. I should be using the agnostic approach of Dijkstra here.

Calculates the “best” source and sink from a dataset based on two simple heuristics: (1) the fragment with the earliest (or latest, for sink) starting point, (2) the fragment with the greatest number of onsets.

Parameters

data (list) – a list of decitala.search.Extraction objects.

decitala.path_finding.path_finding_utils.make_2D_grid(resolution)[source]

Function for generating a grid of two numbers that sum to 1, iterated over the given resolution.

Parameters

resolution (float) – resolution of the grid, \(0 < x <= 1\).

decitala.path_finding.path_finding_utils.make_3D_grid(resolution)[source]

Function for generating a grid of three numbers that sum to 1, iterated over the given resolution.

Parameters

resolution (float) – resolution of the grid, \(0 < x <= 1\).

decitala.path_finding.path_finding_utils.make_4D_grid(resolution)[source]

Function for generating a grid of four numbers that sum to 1, iterated over the given resolution.

Parameters

resolution (float) – resolution of the grid, \(0 < x <= 1\).

decitala.path_finding.path_finding_utils.default_split_dict()[source]

Default splits for common compound greek metrics. Splits are either obvious (e.g. triiamb) or provided by Messiaen.

decitala.path_finding.path_finding_utils.split_extractions(data, all_res, split_dict={<fragment.GreekFoot Diiamb>: [<fragment.GreekFoot Iamb>, <fragment.GreekFoot Iamb>], <fragment.GreekFoot Ditrochee>: [<fragment.GreekFoot Trochee>, <fragment.GreekFoot Trochee>], <fragment.GreekFoot Dicretic>: [<fragment.GreekFoot Amphimacer>, <fragment.GreekFoot Amphimacer>], <fragment.GreekFoot Dianapest>: [<fragment.GreekFoot Anapest>, <fragment.GreekFoot Anapest>], <fragment.GreekFoot Didactyl>: [<fragment.GreekFoot Dactyl>, <fragment.GreekFoot Dactyl>], <fragment.GreekFoot Diproceleusmatic>: [<fragment.GreekFoot Proceleusmatic>, <fragment.GreekFoot Proceleusmatic>], <fragment.GreekFoot Dochmius>: [<fragment.GreekFoot Iamb>, <fragment.GreekFoot Amphimacer>], <fragment.GreekFoot Triiamb>: [<fragment.GreekFoot Iamb>, <fragment.GreekFoot Iamb>, <fragment.GreekFoot Iamb>], <fragment.GreekFoot Triproceleusmatic>: [<fragment.GreekFoot Proceleusmatic>, <fragment.GreekFoot Proceleusmatic>, <fragment.GreekFoot Proceleusmatic>]})[source]

TODO: rename all_res to all_extractions. Function for splitting a list of extraction objects by a given split_dict.

Parameters
  • data (list) – a list of decitala.search.Extraction objects (corresponding to, probably, a path of fragments).

  • data – a list of decitala.search.Extraction objects (corresponding to the complete extractions from a filepath-part.

  • split_dict (dict) – the dictionary used to split the extracted fragments into their components. Default is path_finding_utils.split_dict

decitala.path_finding.path_finding_utils.check_accuracy(training_data, calculated_data, mode, return_list)[source]

The training_data is the analysis as provided by Messiean. The input_data is the data calculated by path-finding.

NOTE: the data is stored in two different formats, hence the use of mode. This will (hopefully) be fixed in the future.

dijkstra

decitala.path_finding.dijkstra.dijkstra(data, graph, source, cost_function_class=<decitala.path_finding.path_finding_utils.CostFunction3D object>)[source]

Dijkstra path-finding algorithm from dynamic programming. Uses a min-heap data structure for efficiency.

Parameters
decitala.path_finding.dijkstra.dijkstra_best_source_and_sink(data, cost_function_class=<decitala.path_finding.path_finding_utils.CostFunction3D object>, enforce_earliest_start=False, verbose=False)[source]

Function for agnostically choosing the best source and target (and associated predecessor set) via Dijkstra. Only requires regular data input.

Parameters
decitala.path_finding.dijkstra.generate_path(pred, source, target)[source]

Returns the optimal path extracted from Dijkstra.

Parameters

floyd_warshall

Implementation of the Floyd-Warshall Algorithm (path of minimal cost).

decitala.path_finding.floyd_warshall.floyd_warshall(data, cost_function_class=<decitala.path_finding.path_finding_utils.CostFunction3D object>, verbose=False)[source]

Calculates the distance and next matrices of the Floyd-Warshall path-finding algorithm.

Parameters
Returns

Two matrices of size len(data) x len(data): first is the weighted adjacency matrix, the second is the matrix used for path reconstruction.

Return type

tuple

decitala.path_finding.floyd_warshall.reconstruct_standard_path(data, next_matrix, start, end)[source]
Parameters

data (list) – a list of decitala.search.Extraction objects.

decitala.path_finding.floyd_warshall.get_path(start, end, next_matrix, data, slur_constraint=False)[source]

Function for retriving the best path extracted from the Floyd-Warshall algorithm.

Parameters
Returns

best path extracted using the Floyd-Warshall algorithm.

Return type

list

pofp

NOTE: M. Raynard helpfully provided a technique for solving the given problem (an iterative approach to the end-overlapping indices problem) in a StackOverflow post from Summer, 2020. The link to the original post is: https://stackoverflow.com/questions/62734114/iterative-solution-to-end-overlapping-indices.

decitala.path_finding.pofp.check_break_point(data, i)[source]

Helper function for get_break_points. Checks index i of the onset_list that all values prior to it are less than or equal to \(b_i\) and \(s_i\). If True, this means that the data at index i is >= all previous.

Parameters
  • data (list) – data from rolling_search.

  • i (int) – index of the data to check.

Returns

whether or not the queried index is a break point.

Return type

bool

decitala.path_finding.pofp.get_break_points(data)[source]
Parameters

data (list) – data from rolling_search.

Returns

every index in the input at which the data is at most end-overlapping.

Return type

list

decitala.path_finding.pofp.partition_data_by_break_points(data)[source]

Partitions the input data according to all calculated breakpoints.

decitala.path_finding.pofp.get_pareto_optimal_longest_paths(data)[source]
>>> from decitala.search import Extraction
>>> from decitala.fragment import GreekFoot, GeneralFragment
>>> data = [
...     Extraction(fragment=GreekFoot("Spondee"), onset_range=(0.0, 0.5), retrograde=False, factor=0.125, difference=0.0, mod_hierarchy_val=1, pitch_content=[None], is_spanned_by_slur=False, slur_count=0, slur_start_end_count=0, id_=1), # noqa
...     Extraction(fragment=GeneralFragment([0.25, 0.25], name="cs-test1"), onset_range=(0.0, 0.5), retrograde=False, factor=2.0, difference=0.0, mod_hierarchy_val=1, pitch_content=[None], is_spanned_by_slur=False, slur_count=0, slur_start_end_count=0, id_=2), # noqa
...     Extraction(fragment=GreekFoot("Trochee"), onset_range=(0.25, 0.625), retrograde=False, factor=0.125, difference=0.0, mod_hierarchy_val=1, pitch_content=[None], is_spanned_by_slur=False, slur_count=0, slur_start_end_count=0, id_=3), # noqa
...     Extraction(fragment=GeneralFragment([0.25, 0.125], name="cs-test2"), onset_range=(0.25, 0.625), retrograde=False, factor=0.125, difference=0.0, mod_hierarchy_val=1, pitch_content=[None], is_spanned_by_slur=False, slur_count=0, slur_start_end_count=0, id_=4), # noqa
...     Extraction(fragment=GreekFoot("Dactyl"), onset_range=(0.5, 1.0), retrograde=False, factor=0.125, difference=0.0, mod_hierarchy_val=1, pitch_content=[None], is_spanned_by_slur=False, slur_count=0, slur_start_end_count=0, id_=5) # noqa
... ]
>>> for path in get_pareto_optimal_longest_paths(data):
...     for fragment in path:
...             print(fragment.fragment, fragment.onset_range)
...     print("-----")
<fragment.GreekFoot Spondee> (0.0, 0.5)
<fragment.GreekFoot Dactyl> (0.5, 1.0)
-----
<fragment.GeneralFragment cs-test1: [0.25 0.25]> (0.0, 0.5)
<fragment.GreekFoot Dactyl> (0.5, 1.0)
-----
<fragment.GreekFoot Trochee> (0.25, 0.625)
-----
<fragment.GeneralFragment cs-test2: [0.25  0.125]> (0.25, 0.625)
-----