observer_cli


Observer CLI is a library that can be dropped into any BEAM node to help DevOps engineers diagnose problems on production nodes. It is based on recon.



Installation

Erlang

%% rebar.config
{deps, [observer_cli]}
%% erlang.mk
dep_observer_cli = hex 1.8.3

Elixir

# mix.exs
   def deps do
     [{:observer_cli, "~> 1.8"}]
   end

How-To

Try it in a local shell.

Erlang

%% rebar3 project
rebar3 shell
1> observer_cli:start().

Elixir

# mix project
iex -S mix
iex(1)> :observer_cli.start

Monitor remote node

Erlang

%% rebar3 project
rebar3 shell --name 'observer_cli@127.0.0.1'
1> observer_cli:start('target@host', 'magic_cookie').

Elixir

# mix project
iex --name "observer_cli@127.0.0.1" -S mix
iex(1)> :observer_cli.start(:"target@host", :magic_cookie)


Note: make sure the observer_cli application has been loaded on the target node.

Escriptize

  1. cd path/to/observer_cli/
  2. Run rebar3 escriptize to generate an escript executable containing the project's BEAM files and those of its dependencies. Place the script (_build/default/bin/observer_cli) anywhere on your PATH and use the observer_cli command.
  3. Run observer_cli TARGETNODE [TARGETCOOKIE REFRESHMS] to monitor a remote node.

Features

Home Panel


The Home panel provides a comprehensive overview of your Erlang node:

erlang:system_info/1 returns information about the current system for the items listed below. When a count exceeds 85% of its limit, it is shown in red.

Metric | Source/Limit
Proc Count | process_count/process_limit
Port Count | port_count/port_limit
Atom Count | atom_count/atom_limit
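
As a rough illustration of the ratio check described above, the following sketch reads each count and limit with erlang:system_info/1 and flags anything over 85%. The module name limit_ratio and its function names are hypothetical and are not part of observer_cli.

```erlang
-module(limit_ratio).
-export([ratios/0, is_critical/1]).

%% Returns [{Name, Count, Limit, Ratio}] for processes, ports and atoms.
%% atom_count and atom_limit need OTP 20 or later.
ratios() ->
    Pairs = [{process, process_count, process_limit},
             {port, port_count, port_limit},
             {atom, atom_count, atom_limit}],
    [begin
         Count = erlang:system_info(CountKey),
         Limit = erlang:system_info(LimitKey),
         {Name, Count, Limit, Count / Limit}
     end || {Name, CountKey, LimitKey} <- Pairs].

%% 0.85 mirrors the 85% threshold mentioned above (assumed to be a
%% strict "greater than" comparison).
is_critical({_Name, _Count, _Limit, Ratio}) ->
    Ratio > 0.85.
```

Calling lists:filter(fun limit_ratio:is_critical/1, limit_ratio:ratios()) would then list only the resources that are close to their limit.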

ps reports a snapshot of the BEAM process.

Command/Flag | Description
ps -o pcpu | CPU utilization of the process in "##.#" format. Currently, it is the CPU time used divided by the time the process has been running (cputime/realtime ratio), expressed as a percentage. It will not add up to 100% unless you are lucky.
ps -o pmem | Ratio of the process's resident set size to the physical memory on the machine, expressed as a percentage.

erlang:memory/0 returns a list with information about memory dynamically allocated by the Erlang emulator.

erlang:statistics/1

Statistic | Description
active task | Returns the same as statistics(active_tasks_all), except that no information about the dirty IO run queue and its associated schedulers is part of the result. That is, only tasks that are expected to be CPU bound are part of the result.
context switches | The total number of context switches since the system started.
reductions (total/sinceLastCall) | Total reductions and reductions since the last call.
io | The total number of bytes received and sent through ports, and the growth of each during the refresh interval.
garbage_collection | erlang:statistics(garbage_collection), shown as the total value and the growth of {Number_of_GCs, Words_Reclaimed} during the refresh interval.
run_queue | The total length of all normal run queues. That is, the number of processes and ports that are ready to run on all available normal run queues. Dirty run queues are not part of the result.

Increments are values that are mostly useful when compared to a previous one to have an idea what they're doing, because otherwise they'd never stop increasing: bytes in and out of the node, number of garbage collector runs, words of memory that were garbage collected, and the global reductions count for the node.
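
To make the idea of increments concrete, here is a minimal sketch that takes two snapshots of the ever-growing counters and subtracts them. The module name stat_delta and the map keys are made up for illustration. This is not how observer_cli itself is implemented.

```erlang
-module(stat_delta).
-export([snapshot/0, delta/2]).

%% Collects the cumulative counters named in the paragraph above.
snapshot() ->
    {{input, In}, {output, Out}} = erlang:statistics(io),
    {GCs, Words, _} = erlang:statistics(garbage_collection),
    {Reds, _SinceLastCall} = erlang:statistics(reductions),
    #{bytes_in => In,
      bytes_out => Out,
      gc_count => GCs,
      gc_words_reclaimed => Words,
      reductions => Reds}.

%% Subtracts an older snapshot from a newer one, key by key.
delta(Old, New) ->
    maps:map(fun(Key, NewVal) -> NewVal - maps:get(Key, Old) end, New).
```

Taking one snapshot, waiting for a refresh interval, taking another, and passing both to delta/2 gives the growth of each counter during that interval.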

Scheduler utilization is derived from erlang:statistics(scheduler_wall_time).
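
A per-scheduler utilization figure can be computed from two scheduler_wall_time samples, following the approach shown in the erlang:statistics/1 documentation. The sketch below assumes a hypothetical module named sched_util. It enables the flag for the duration of the sample and then restores the previous setting.

```erlang
-module(sched_util).
-export([sample/1]).

%% Returns [{SchedulerId, Utilization}] measured over IntervalMs,
%% where Utilization is a float between 0 and 1.
sample(IntervalMs) ->
    %% system_flag/2 returns the previous value, so it can be restored.
    Prev = erlang:system_flag(scheduler_wall_time, true),
    T0 = lists:sort(erlang:statistics(scheduler_wall_time)),
    timer:sleep(IntervalMs),
    T1 = lists:sort(erlang:statistics(scheduler_wall_time)),
    erlang:system_flag(scheduler_wall_time, Prev),
    [{Id, safe_div(A1 - A0, W1 - W0)}
     || {{Id, A0, W0}, {Id, A1, W1}} <- lists:zip(T0, T1)].

%% Guards against a zero-length wall-time interval.
safe_div(_Active, 0) -> 0.0;
safe_div(Active, Total) -> Active / Total.
```

For example, sched_util:sample(1000) measures each scheduler over one second.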

Process

When looking for high memory usage, for example, it is useful to list all of a node's processes and find the top N consumers. Typing m and then pressing Enter calls recon:proc_count(memory, N) and shows the top N processes by memory.


recon:proc_count/2 and recon:proc_window/3 are to be used when you require information about processes in a larger sense: biggest consumers of given process memory, reductions, binary, total_heap_size, message_queue_len, either absolutely or over a sliding time window, respectively.

For more detail about sliding time windows, see recon:proc_window/3.
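
The difference between an absolute count and a sliding window can be sketched with plain erlang:process_info/2. The following is a simplified approximation of the idea, not recon's actual implementation. The module name top_procs is hypothetical. It only handles numeric attributes such as memory, reductions, total_heap_size and message_queue_len.

```erlang
-module(top_procs).
-export([count/2, window/3]).

%% Top N processes by the current (absolute) value of Attr.
count(Attr, N) ->
    top(N, sample(Attr)).

%% Top N processes by how much Attr grew over Ms milliseconds.
%% Processes that start or die during the window are ignored.
window(Attr, N, Ms) ->
    First = maps:from_list(sample(Attr)),
    timer:sleep(Ms),
    Deltas = [{Pid, Val - maps:get(Pid, First)}
              || {Pid, Val} <- sample(Attr), maps:is_key(Pid, First)],
    top(N, Deltas).

%% process_info/2 returns undefined for dead processes; the generator
%% pattern below silently skips those.
sample(Attr) ->
    [{Pid, Val} || Pid <- erlang:processes(),
                   {_, Val} <- [erlang:process_info(Pid, Attr)]].

top(N, Pairs) ->
    lists:sublist(lists:reverse(lists:keysort(2, Pairs)), N).
```

top_procs:count(memory, 10) corresponds roughly to the absolute view, while top_procs:window(reductions, 10, 1000) shows which processes did the most work during the last second.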

When an abnormal process is found, enter the sequence number (an integer) of the suspected process and press Enter. This calls erlang:process_info/2 to show much of the available information about the process (which is safe to use in production).
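
As an illustration of the kind of call involved, this sketch requests a fixed list of items from erlang:process_info/2. The module name proc_detail and the choice of items are assumptions. observer_cli may request a different set.

```erlang
-module(proc_detail).
-export([info/1]).

%% Returns a list of {Item, Value} tuples, or undefined if the
%% process is no longer alive.
info(Pid) when is_pid(Pid) ->
    erlang:process_info(Pid, [registered_name,
                              current_function,
                              initial_call,
                              status,
                              message_queue_len,
                              memory,
                              total_heap_size,
                              reductions,
                              links,
                              monitors,
                              trap_exit]).
```

The list deliberately leaves out items such as messages, which copy the whole mailbox and can be expensive for a process with a long queue.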


Network


Fetches a given attribute from all inet ports (TCP, UDP, SCTP) and returns the biggest Num consumers, using recon:inet_count/2 and recon:inet_window/3. For attribute names, refer to inet:getstat/1.

Once you have found which port is slowly but surely eating up all your bandwidth, enter the sequence number (an integer) of the suspected port and press Enter. This calls recon:port_info/2 to show much of the available information about the port.
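
The idea of ranking inet ports by one statistic can be approximated with the standard library alone. The sketch below is an assumption-laden illustration, not recon's code. The module name top_inet is hypothetical. It identifies inet ports by their driver name and covers only sockets that are backed by ports.

```erlang
-module(top_inet).
-export([count/2]).

%% Top N inet ports by a single inet:getstat/2 attribute,
%% for example recv_oct or send_oct.
count(Attr, N) ->
    Stats = [{Port, Val}
             || Port <- erlang:ports(),
                is_inet(Port),
                {ok, [{_, Val}]} <- [inet:getstat(Port, [Attr])]],
    lists:sublist(lists:reverse(lists:keysort(2, Stats)), N).

%% A port counts as an inet port if its driver is one of the three
%% inet drivers. Ports that closed in the meantime are skipped.
is_inet(Port) ->
    case erlang:port_info(Port, name) of
        {name, Name} ->
            lists:member(Name, ["tcp_inet", "udp_inet", "sctp_inet"]);
        undefined ->
            false
    end.
```

top_inet:count(send_oct, 5) would list the five ports that have sent the most bytes since they were opened.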


System


ETS


ETS tables are never garbage collected, and they keep their memory for as long as records remain in the table. Only removing records manually (or deleting the table) will reclaim memory.

The top N tables are listed, sorted by memory size. All items are defined in ets:info/2.
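
A minimal sketch of such a ranking, assuming a hypothetical module named top_ets: it reads the memory item of ets:info/2 for every table and converts it from words to bytes.

```erlang
-module(top_ets).
-export([by_memory/1]).

%% Returns the N largest ETS tables as [{Table, Bytes}].
by_memory(N) ->
    WordSize = erlang:system_info(wordsize),
    %% ets:info/2 reports memory in words and returns undefined if the
    %% table was deleted in the meantime; the is_integer filter skips it.
    Sizes = [{Tab, Words * WordSize}
             || Tab <- ets:all(),
                Words <- [ets:info(Tab, memory)],
                is_integer(Words)],
    lists:sublist(lists:reverse(lists:keysort(2, Sizes)), N).
```

Deleting records from a table returned by top_ets:by_memory(10), or deleting the table itself, is what actually frees the memory shown.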

Mnesia


The top N tables are listed, sorted by memory size. All items are defined in mnesia:table_info/2.

Application


Application debug information is obtained from application_controller:info().
