feat(typha): add new integration #2545

Open
wants to merge 4 commits into
base: master

cmotta2016

What does this PR do?

Create a new integration called Typha.

Motivation

We need to monitor the entire Calico stack, but there is no integration for the Typha component, so we wrote this custom Agent check to collect Typha metrics and send them to Datadog.

Review checklist

  • PR has a meaningful title or PR has the no-changelog label attached
  • Feature or bugfix has tests
  • Git history is clean
  • If PR impacts documentation, docs team has been notified or an issue has been opened on the documentation repo
  • If this PR includes a log pipeline, please add a description describing the remappers and processors.

Additional Notes

Anything else we should know when reviewing?

cmotta2016 requested review from a team as code owners on November 22, 2024 18:05
@cmotta2016
Author

Hey guys. Would anyone be able to help me with the failed checks?
(About validation I will set the sales_email)

@drichards-87
Contributor

Created a Jira card for Docs Team editorial review.

drichards-87 added the `editorial review` label (Waiting on a more in-depth review from a docs team editor) on Nov 22, 2024
@drichards-87
Contributor

> Hey guys. Would anyone be able to help me with the failed checks? (About validation I will set the sales_email)

I'm not too sure about those. Someone from the Agent Integrations team should be able to help you out.

@dd-dominic
Collaborator

@cmotta2016 Are you working with someone from the Datadog team already? If not, can you please share your email address so we can communicate on a separate thread?

@cmotta2016
Author

Hello @dd-dominic, we are contacting our partner (Apoena) to request a review of this PR with Datadog.


This check monitors [Typha][1] to collect Prometheus metrics through the Datadog Agent.

## Enabling Prometheus Metrics
Contributor

Suggested change
## Enabling Prometheus Metrics
## Enabling Prometheus metrics

TYPHA_PROMETHEUSMETRICSPORT: "9093"
```

See the official [documentation][2] for more information.
Contributor

Suggested change
See the official [documentation][2] for more information.
See [Configuring Typha][2] for more information.
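
Once the metrics port is set, Typha exposes its metrics in the Prometheus text format. As a rough sketch of what a check would consume, the snippet below parses such a payload into name/value pairs. Only port 9093 comes from the configuration quoted above; the `/metrics` path, the `fetch_metrics` and `parse_samples` helper names, and the sample values are assumptions for illustration.

```python
from typing import Dict
from urllib.request import urlopen

# Made-up payload in the Prometheus text format; values are illustrative only.
SAMPLE = """\
# TYPE typha_connections_accepted counter
typha_connections_accepted 12
# TYPE typha_connections_active gauge
typha_connections_active 3
"""


def fetch_metrics(url: str = "http://localhost:9093/metrics") -> str:
    """Fetch the raw payload. The host and /metrics path are assumptions."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


def parse_samples(text: str) -> Dict[str, float]:
    """Parse 'name value' sample lines, skipping comments and blank lines.

    Simplification: lines with an optional trailing timestamp are not handled.
    """
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated token on the line.
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


print(parse_samples(SAMPLE))
```

In a real deployment the Agent's OpenMetrics support does this parsing; the sketch is only meant to show the shape of the data behind the port setting.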


### Validation

[Run the Agent's status subcommand][5] and look for `typha` under the Checks section.
Contributor

Suggested change
[Run the Agent's status subcommand][5] and look for `typha` under the Checks section.
Run the Agent's [status subcommand][5] and look for `typha` under the Checks section.


[Run the Agent's status subcommand][5] and look for `typha` under the Checks section.

For containerized environment:
Contributor

Suggested change
For containerized environment:
For a containerized environment, use the following command:

options:
- name: prometheus_url
required: true
description: The metric endpoint of your typha instance.
Contributor

Suggested change
description: The metric endpoint of your typha instance.
description: The metric endpoint of your Typha instance.
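
Because `prometheus_url` is marked as required, an instance without it should be rejected early. The following is a minimal sketch of such a check, not the integration's actual validation logic; the `validate_instance` helper and the example URL are hypothetical.

```python
from urllib.parse import urlparse


def validate_instance(instance: dict) -> str:
    """Return the prometheus_url of an instance, or raise if it is unusable."""
    url = instance.get("prometheus_url")
    if not url:
        raise ValueError("prometheus_url is required")
    parsed = urlparse(url)
    # Accept only absolute HTTP(S) URLs that include a host.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"prometheus_url is not an HTTP(S) URL: {url!r}")
    return url


# Hypothetical instance, as it might appear after loading the check's YAML config.
example_instance = {"prometheus_url": "http://localhost:9093/metrics"}
print(validate_instance(example_instance))
```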

typha.breadcrumb.size.count,count,,,,Number of KVs recorded in each breadcrumb.,0,typha,,,
typha.breadcrumb.size.sum,count,,,,Number of KVs recorded in each breadcrumb.,0,typha,,,
typha.client.latency.secs.count,count,,,,Per-client latency. I.e. how far behind the current state is each client.,0,typha,,,
typha.client.latency.secs.sum,count,,,,Per-client latency. I.e. how far behind the current state is each client.,0,typha,,,
Contributor

Suggested change
typha.client.latency.secs.sum,count,,,,Per-client latency. I.e. how far behind the current state is each client.,0,typha,,,
typha.client.latency.secs.sum,count,,,,Per-client latency. That is, how far behind the current state is each client.,0,typha,,,

typha.client.write.latency.secs.sum,count,,,,Per-client write. How long each write call is taking.,0,typha,,,
typha.connections.accepted,count,,,,Total number of connections accepted over time.,0,typha,typha.connections.accepted,,
typha.connections.active,gauge,,,,Number of open client connections (including connections that have not completed the handshake).,0,typha,typha.connections.active,,
typha.connections.streaming,gauge,,,,Number of client connections that are actively streaming (i.e. connections that successfully completed the handshake).,0,typha,typha.connections.streaming,,
Contributor

Suggested change
typha.connections.streaming,gauge,,,,Number of client connections that are actively streaming (i.e. connections that successfully completed the handshake).,0,typha,typha.connections.streaming,,
typha.connections.streaming,gauge,,,,Number of client connections that are actively streaming (that is, connections that successfully completed the handshake).,0,typha,typha.connections.streaming,,

typha.next.breadcrumb.latency.secs.sum,count,,,,Time to retrieve next breadcrumb when already behind.,0,typha,,,
typha.ping.latency.count,count,,,,Round-trip ping/pong latency to client. Typha's protocol includes a regular ping/pong keepalive to verify that the connection is still up.,0,typha,,,
typha.ping.latency.sum,count,,,,Round-trip ping/pong latency to client. Typha's protocol includes a regular ping/pong keepalive to verify that the connection is still up.,0,typha,,,
typha.updates.skipped,count,,,,Total number of updates skipped because the datastore change was not relevant. (For example an update to a Kubernetes Pod field that Calico does not read.),0,typha,typha.updates.skipped,,
Contributor

Suggested change
typha.updates.skipped,count,,,,Total number of updates skipped because the datastore change was not relevant. (For example an update to a Kubernetes Pod field that Calico does not read.),0,typha,typha.updates.skipped,,
typha.updates.skipped,count,,,,Total number of updates skipped because the datastore change was not relevant. (For example, an update to a Kubernetes Pod field that Calico does not read.),0,typha,typha.updates.skipped,,

typha.snapshots.reused,count,,,,The number of binary snapshots that Typha was able to reuse for multiple clients thus reducing CPU usage.,0,typha,typha.snapshots.reused,,
typha.snapshot.raw.bytes,gauge,,,,The size of the most recent binary snapshot in bytes pre-compression.,0,typha,typha.snapshot.raw.bytes,,
typha.snapshots.compressed.bytes,gauge,,,,The size of the most recent binary snapshot in bytes post-compression.,0,typha,typha.snapshots.compressed.bytes,,
typha.breadcrumb.block,count,,,,Count of the number of times Typha got the next Breadcrumb after blocking.,0,typha,typha.breadcrumb.block.,,
Contributor

Suggested change
typha.breadcrumb.block,count,,,,Count of the number of times Typha got the next Breadcrumb after blocking.,0,typha,typha.breadcrumb.block.,,
typha.breadcrumb.block,count,,,,Count of the number of times Typha got the next breadcrumb after blocking.,0,typha,typha.breadcrumb.block.,,

typha.snapshot.raw.bytes,gauge,,,,The size of the most recent binary snapshot in bytes pre-compression.,0,typha,typha.snapshot.raw.bytes,,
typha.snapshots.compressed.bytes,gauge,,,,The size of the most recent binary snapshot in bytes post-compression.,0,typha,typha.snapshots.compressed.bytes,,
typha.breadcrumb.block,count,,,,Count of the number of times Typha got the next Breadcrumb after blocking.,0,typha,typha.breadcrumb.block.,,
typha.breadcrumb.non.block,count,,,,Count of the number of times Typha got the next Breadcrumb without blocking.,0,typha,typha.breadcrumb.non.block,,
Contributor

Suggested change
typha.breadcrumb.non.block,count,,,,Count of the number of times Typha got the next Breadcrumb without blocking.,0,typha,typha.breadcrumb.non.block,,
typha.breadcrumb.non.block,count,,,,Count of the number of times Typha got the next breadcrumb without blocking.,0,typha,typha.breadcrumb.non.block,,
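
The metadata rows quoted above are unquoted CSV, so a comma added inside a description (as in the "That is, ..." and "For example, ..." suggestions) is read as a field separator unless the description is wrapped in double quotes. The sketch below illustrates this with Python's `csv` module. It assumes an 11-column layout, inferred from the rows shown here since the header row is not part of this excerpt; the `field_count` helper is hypothetical.

```python
import csv

# Assumed column count for metadata.csv; the header row is not shown in this excerpt.
EXPECTED_COLUMNS = 11


def field_count(row_text: str) -> int:
    """Return how many CSV fields a single metadata.csv row parses into."""
    return len(next(csv.reader([row_text])))


original = (
    "typha.client.latency.secs.sum,count,,,,"
    "Per-client latency. I.e. how far behind the current state is each client.,"
    "0,typha,,,"
)
suggested = (
    "typha.client.latency.secs.sum,count,,,,"
    "Per-client latency. That is, how far behind the current state is each client.,"
    "0,typha,,,"
)
# Same wording as the suggestion, with the description wrapped in double quotes.
quoted = (
    "typha.client.latency.secs.sum,count,,,,"
    '"Per-client latency. That is, how far behind the current state is each client.",'
    "0,typha,,,"
)

for label, row in (("original", original), ("suggested", suggested), ("quoted", quoted)):
    count = field_count(row)
    status = "ok" if count == EXPECTED_COLUMNS else "column mismatch"
    print(f"{label}: {count} fields ({status})")
```

Quoting the description keeps the suggested wording while preserving the column count.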

Labels
editorial review Waiting on a more in-depth review from a docs team editor

5 participants