VM watch for Azure VMs | Public Preview


VM Watch, currently in public preview, is a standardized, portable, and flexible in-VM service for virtual machines and virtual machine scale sets. At configurable intervals, it runs health checks inside the virtual machine and sends the results to Azure using a uniform data model. Azure’s production monitoring AIOps (AI Operations) engines use these health results to detect and prevent regressions. VM Watch is delivered through the Application Health VM extension, which simplifies deployment and management, and it is available to customers at no additional cost.

VM Watch monitoring details

  • Ease of adoption: VM Watch is available through the Application Health VM extension.
  • Flexible deployment: VM Watch can be enabled with an ARM template, PowerShell, or the Azure CLI.
  • Compatibility: VM Watch runs on both Windows and Linux, and it supports both standalone VMs and virtual machine scale sets (VMSS).
  • Resource governance: VM Watch monitors efficiently without affecting system performance. To protect the virtual machine, CPU and memory caps are applied to the VM Watch process itself.
  • Ready out of the box: VM Watch ships with a set of default tests that can be customized for scenario-specific testing. The tests (checks, metrics, and event logs) are listed below by category.

 

Network:

Signal Name | Type | Description
Outbound connectivity | Check | Verify the network outbound connectivity from the Azure VM.
DNS Resolution | Check | Verify if the DNS name(s) can be resolved.
SegmentsRetransmitted | Metric | The number of TCP segments transmitted containing one or more previously transmitted octets.
NormalizedSegmentsRetransmitted | Metric | SegmentsRetransmitted / (SegmentsSent + SegmentsReceived)
ConnectionResets | Metric | Number of times TCP connections have made a direct transition to the CLOSED state from either the ESTABLISHED state or the CLOSE_WAIT state.
NormalizedConnectionResets | Metric | ConnectionResets / CurrentConnections
FailedConnectionAttempts | Metric | Number of times TCP connections have made a direct transition to the CLOSED state from either the SYN_SENT state or the SYN_RCVD state.
NormalizedFailedConnectionAttempts | Metric | FailedConnectionAttempts / (ActiveConnectionOpenings + PassiveConnectionOpenings)
ActiveConnectionOpenings | Metric | Number of times TCP connections have made a direct transition to the SYN_SENT state from the CLOSED state.
PassiveConnectionOpenings | Metric | Number of times TCP connections have made a direct transition to the SYN_RCVD state from the LISTEN state.
CurrentConnections | Metric | Number of connections established.
SegmentsReceived | Metric | Number of segments received, including those received in error.
SegmentsSent | Metric | Number of segments sent, including those on current connections but excluding those containing only retransmitted octets.
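
The three normalized metrics listed above are plain ratios of the raw TCP counters. A minimal Python sketch of that arithmetic follows, assuming the counters have already been collected. The `TcpCounters` class, its field names, and the choice to report 0.0 when a denominator is zero are illustrative assumptions, not part of VM Watch.

```python
from dataclasses import dataclass


@dataclass
class TcpCounters:
    """Hypothetical container for the raw TCP counters listed above."""
    segments_sent: int
    segments_received: int
    segments_retransmitted: int
    connection_resets: int
    failed_connection_attempts: int
    active_connection_openings: int
    passive_connection_openings: int
    current_connections: int


def _ratio(numerator: int, denominator: int) -> float:
    # Assumption: return 0.0 for a zero denominator. The source does not say
    # how VM Watch handles this case.
    return numerator / denominator if denominator else 0.0


def normalized_metrics(c: TcpCounters) -> dict[str, float]:
    """Apply the three normalization formulas from the Network signals."""
    return {
        "NormalizedSegmentsRetransmitted": _ratio(
            c.segments_retransmitted, c.segments_sent + c.segments_received
        ),
        "NormalizedConnectionResets": _ratio(
            c.connection_resets, c.current_connections
        ),
        "NormalizedFailedConnectionAttempts": _ratio(
            c.failed_connection_attempts,
            c.active_connection_openings + c.passive_connection_openings,
        ),
    }


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample = TcpCounters(
        segments_sent=600,
        segments_received=400,
        segments_retransmitted=10,
        connection_resets=2,
        failed_connection_attempts=3,
        active_connection_openings=20,
        passive_connection_openings=10,
        current_connections=8,
    )
    for name, value in normalized_metrics(sample).items():
        print(f"{name}: {value:.4f}")
```

Normalizing by traffic volume or connection count makes the values comparable between a busy VM and an idle one, which the raw counters alone are not.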

 

Disk:

Signal Name | Type | Description
Azure Disk I/O | Check | Verify file creation, write, read, delete operations on each drive mounted to the VM
FreeSpaceInBytes | Metric | The free disk space of the target mount point
UsedSpaceInBytes | Metric | The used disk space of the target mount point
CapacityInBytes | Metric | The disk space capacity of the target mount point
UsedPercent | Metric | The used disk space percentage of the target mount point
WriteOps | Metric | The write operations per second of the target disk/partition
ReadOps | Metric | The read operations per second of the target disk/partition
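
To make the four space metrics concrete, here is a minimal Python sketch that derives them for one mount point with the standard library's `shutil.disk_usage`. The function name `disk_space_metrics` is hypothetical, and computing `UsedPercent` as used divided by capacity is an assumption; the source does not say how VM Watch calculates it.

```python
import shutil


def disk_space_metrics(mount_point: str) -> dict[str, float]:
    """Return the space metrics named above for a single mount point."""
    usage = shutil.disk_usage(mount_point)  # named tuple: total, used, free (bytes)
    return {
        "FreeSpaceInBytes": usage.free,
        "UsedSpaceInBytes": usage.used,
        "CapacityInBytes": usage.total,
        # Assumption: percentage is used / capacity.
        "UsedPercent": 100.0 * usage.used / usage.total if usage.total else 0.0,
    }


if __name__ == "__main__":
    # "." resolves to the file system that holds the current directory.
    for name, value in disk_space_metrics(".").items():
        print(f"{name}: {value}")
```
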

CPU:

Signal Name | Type | Description
ProcessCoreUsage | Metric | An instantaneous measurement of the percentage of a single CPU core that the target process is using (100 = 100%, a whole core)
ProcessMachineUsage | Metric | The percentage of the machine’s total CPU that this process is using
MachineTotalCpuUsage | Metric | The VM’s total instantaneous CPU utilization
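
The difference between the per-core and per-machine figures is easiest to see with numbers. The Python sketch below is an illustration only: it measures the current process over a short interval instead of instantaneously, and it assumes that machine usage equals core usage divided by the number of cores. Both function names are hypothetical, and the source does not describe how VM Watch samples these values.

```python
import os
import time


def process_core_usage(cpu_seconds: float, wall_seconds: float) -> float:
    """Percent of one core used over the interval (100 = one whole core)."""
    return 100.0 * cpu_seconds / wall_seconds


def process_machine_usage(core_usage: float, core_count: int) -> float:
    """Percent of the whole machine, assuming every core counts equally."""
    return core_usage / core_count


if __name__ == "__main__":
    wall_start, cpu_start = time.perf_counter(), time.process_time()
    sum(i * i for i in range(2_000_000))  # some CPU work to measure
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start

    core = process_core_usage(cpu_used, wall_used)
    machine = process_machine_usage(core, os.cpu_count() or 1)
    print(f"ProcessCoreUsage: {core:.1f}")
    print(f"ProcessMachineUsage: {machine:.1f}")
```

Under this assumption, a process that keeps two cores fully busy on a four-core VM would report 200 for the per-core figure and 50 for the per-machine figure.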

Process:

Signal Name | Type | Description
Process Creation | Check | Starts a lightweight process to validate that process creation is possible
Running Process(es) | Check | Verify if the target process(es) are running
UpTime | Metric | How long the target process has been up and running since last process startup
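
The idea behind the Process Creation check can be sketched in a few lines of Python: start a trivial child process and confirm that it exits cleanly. This is only an illustration of the concept. The function name, the choice of child command, and the timeout are assumptions, and the source does not say which process VM Watch actually launches.

```python
import subprocess
import sys


def can_create_process(timeout_seconds: float = 10.0) -> bool:
    """Return True if a trivial child process starts and exits with code 0."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", "pass"],  # child does nothing and exits
            timeout=timeout_seconds,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        # OSError covers failures to spawn (for example, resource exhaustion).
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("process creation ok" if can_create_process() else "process creation failed")
```
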

IMDS:

Signal Name | Type | Description
IMDS | Check | Verify that the IMDS endpoint can be reached from within the VM and that the endpoint query returns the VM information
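
A check of this kind can be approximated by querying the Azure Instance Metadata Service directly. The Python sketch below uses the documented IMDS address and the required `Metadata: true` header. The function names, the chosen API version, the timeout, and the rule that a response containing a `compute` section counts as success are assumptions made for illustration, not VM Watch's actual logic.

```python
import json
import urllib.request

# IMDS is only reachable from inside an Azure VM, at this non-routable address.
IMDS_URL = "http://169.254.169.254/metadata/instance?api-version=2021-02-01"


def build_imds_request() -> urllib.request.Request:
    """Build the metadata request; IMDS rejects calls without this header."""
    return urllib.request.Request(IMDS_URL, headers={"Metadata": "true"})


def imds_check(timeout_seconds: float = 2.0) -> bool:
    """Return True if IMDS answers with JSON that includes VM information."""
    # An empty ProxyHandler bypasses any configured proxy; IMDS must be
    # contacted directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(build_imds_request(), timeout=timeout_seconds) as resp:
            body = json.load(resp)
    except (OSError, ValueError):
        # OSError covers connection and HTTP errors; ValueError covers bad JSON.
        return False
    return isinstance(body, dict) and "compute" in body


if __name__ == "__main__":
    print("IMDS reachable" if imds_check() else "IMDS not reachable")
```

Run outside Azure, the sketch is expected to report that IMDS is not reachable, because the address only responds from within a VM.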

Clock:

Signal Name | Type | Description
Clock Skew | Check | Verify the clock skew between a remote NTP server and the Azure VM. On Windows VMs, if the remote NTP server is inaccessible, fall back to using w32tm to check whether the Windows Time Service is synchronized
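
For readers unfamiliar with how skew against an NTP server is estimated, the Python sketch below applies the standard NTP offset and delay formulas to the four timestamps of a single request and response exchange. The source does not describe VM Watch's own calculation, so this is background only; the function names and the 5-second limit are hypothetical.

```python
def clock_offset_seconds(t0: float, t1: float, t2: float, t3: float) -> float:
    """Estimate server time minus local time from one NTP exchange.

    t0: request sent (local clock)      t1: request received (server clock)
    t2: response sent (server clock)    t3: response received (local clock)
    A positive result means the local clock is behind the server.
    """
    return ((t1 - t0) + (t2 - t3)) / 2.0


def round_trip_delay_seconds(t0: float, t1: float, t2: float, t3: float) -> float:
    """Network round-trip time, excluding the server's processing time."""
    return (t3 - t0) - (t2 - t1)


def skew_within_limit(offset_seconds: float, limit_seconds: float = 5.0) -> bool:
    # Hypothetical limit; the source does not state the threshold VM Watch uses.
    return abs(offset_seconds) <= limit_seconds


if __name__ == "__main__":
    # Made-up timestamps, in seconds.
    stamps = (100.0, 100.6, 100.7, 100.3)
    offset = clock_offset_seconds(*stamps)
    print(f"offset: {offset:.3f} s")
    print(f"delay: {round_trip_delay_seconds(*stamps):.3f} s")
    print(f"within limit: {skew_within_limit(offset)}")
```
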

AzBlob:

Signal Name | Type | Description
Azure Storage blob connectivity | Check | Verify connectivity to Azure Blob Storage and download the blob using an MSI or a SAS token

Hardware:

Signal Name | Type | Description
Hardware Health Monitor | EventLog | Collect hardware health information from the Windows event log. Currently only disk-related critical events are collected, including events with IDs 7, 500, 504, 505, 512, and 549

 

Learn more from the Microsoft documentation: https://learn.microsoft.com/en-us/azure/virtual-machines/extensions/health-extension?tabs=rest-api
