Scatterplot creation

Creates a scatter plot from PyRanges objects.

pyrangeyes.make_scatter(p, x: str = 'Start', y: str | None = None, color_by: str | None = None, size_by: str | None = None, title: str | None = None, title_size: int | None = None, title_color: str | None = None, height: int | None = None, y_space: int | None = None, engine: str | None = None)

Create a scatter plot from a PyRanges object using Plotly.

This function generates a scatter plot for visualizing genomic variants or other data points based on the provided DataFrame. It allows customization of axes, marker sizes, colors, and plot titles.

Parameters:
  • p (pd.DataFrame) – Input DataFrame containing the genomic data with columns for start and end positions (e.g., ‘Start’ and ‘End’).

  • x (str, optional) – The column name to use for the x-axis. Defaults to ‘Start’.

  • y (str, default None) – The column name to use for the y-axis. Defaults to None.

  • color_by (str, default None) – The column name to use for coloring the markers. If specified, it aggregates unique positions based on this column. Defaults to None.

  • size_by (str, default None) – The column name to use for setting the marker sizes. If specified, it aggregates unique positions based on this column. Defaults to None.

  • title (str, default None) – The title of the plot. Defaults to None.

  • title_size (int, default None) – The font size of the plot title. Applicable only if title is specified. Defaults to None.

  • title_color (str, default None) – The color of the plot title. Applicable only if title is specified. Defaults to None.

  • height (int, default None) – Determines the length of the y-axis. Defaults to None.

  • y_space (int, default None) – The space between the main plot and the added plot. Defaults to None.

Returns:

  • A tuple with the go.Scatter object and a dictionary containing title customization options.

Return type:

Union[go.Scatter, tuple]

Raises:

ValueError – If x, y, color_by, or size_by columns are not found in the input DataFrame.
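
The kind of check that leads to this error can be sketched as follows. This is a minimal illustration under stated assumptions, not the pyrangeyes implementation: the helper name `check_columns` and the error message are hypothetical, and a plain list of column names stands in for the columns of the input.

```python
def check_columns(available, x="Start", y=None, color_by=None, size_by=None):
    """Raise ValueError if a requested column is missing (hypothetical helper)."""
    requested = {"x": x, "y": y, "color_by": color_by, "size_by": size_by}
    # Optional arguments left as None are skipped.
    missing = {
        arg: col
        for arg, col in requested.items()
        if col is not None and col not in available
    }
    if missing:
        details = ", ".join(f"{arg}={col!r}" for arg, col in missing.items())
        raise ValueError(f"Column(s) not found in input: {details}")


# Column names taken from the example data on this page.
columns = ["Chromosome", "Strand", "Start", "End", "transcript_id", "feature1", "Count"]

check_columns(columns, y="Count")  # all requested columns exist, nothing happens

try:
    check_columns(columns, y="Score")  # "Score" is not a column
except ValueError as err:
    print(err)
```

Validating all four arguments in one pass lets the error name every missing column at once instead of stopping at the first one.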

Examples

>>> import pyranges as pr
>>> import pyrangeyes as pre
>>> p = pr.PyRanges({
...     "Chromosome": [1] * 5,
...     "Strand": ["+"] * 3 + ["-"] * 2,
...     "Start": [10, 20, 30, 25, 40],
...     "End": [15, 25, 35, 30, 50],
...     "transcript_id": ["t1"] * 3 + ["t2"] * 2,
...     "feature1": ["A", "B", "C", "A", "B"],
...     "Count": [1, 2, 3, 4, 5]  # Example count values
... })
>>> pre.make_scatter(p, y='Count')
(Scatter({
    'hovertemplate': '<b>Position:</b> %{x}<br><b>Count:</b> %{y}<extra></extra>',
    'marker': {'color': 'blue', 'size': 8},
    'mode': 'markers',
    'x': array([10, 20, 30, 25, 40]),
    'y': array([1, 2, 3, 4, 5])
}), {'title': 'Count'})
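
The returned tuple can be unpacked into the trace and the title options. The sketch below runs without pyrangeyes or Plotly by using a plain dict as a stand-in for the go.Scatter object, with values copied from the output above; the variable names `trace` and `title_options` are illustrative only.

```python
# Stand-in for the value returned by make_scatter in the example above.
# Assumption: a plain dict replaces the go.Scatter object for illustration.
result = (
    {
        "mode": "markers",
        "marker": {"color": "blue", "size": 8},
        "x": [10, 20, 30, 25, 40],
        "y": [1, 2, 3, 4, 5],
    },
    {"title": "Count"},
)

# First element: the trace. Second element: the title customization options.
trace, title_options = result

print(title_options["title"])
print(list(zip(trace["x"], trace["y"])))
```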