ISP Essentials [PDF]


7/11/19



ISP Essentials Workshop – Network Monitoring
Manila, Philippines
8-12 July 2019






Agenda
• Intro to Network Management
• Configuration Management
• Device Monitoring
• Flow Monitoring
• Log Management






Module 1



INTRO TO NETWORK MANAGEMENT






Hosts and Services
• Host
  – Container for services
  – Can be physical or virtual
  – Both have CPU, Disk, Memory, Network interfaces
  – Physical hosts also have
    • Vendors, service contracts
    • Power supplies, temperature
• Service
  – Application software
  – Runs on a host
  – Has allocated resources
  – Has vendors / suppliers






Managing Config Data
• Some Host Configuration Data to Track
  – Physical Device Locations
  – Installed CPU, Disk, Memory, Network Interfaces
  – Serial Numbers, Licenses, OS Revision & Patch Details
• Some Service Configuration Data to Track
  – Allocated Resources, Network Ports
  – Service Permissions, Filters and ACLs, Logging
  – Software Revision & Patch Details



Why Manage Config Data?
• Match Resource Allocation to Revenue Generation
• Ensure our Hosts and applications are securely configured
• Correlate operational results with config changes
• Roll back or restore config when a fault occurs






Operational Data
• Host
  – CPU Utilisation
  – Memory Utilisation
  – Disk Utilisation
  – Network Interface Utilisation
  – Fan State
  – Port Errors
• Service
  – Time to Respond to Request
  – Processes in Use
  – Queue Length
  – State of a BGP session






Operational Data
• Availability
  – Applies to Hosts & Services
  – Percent of time host or service is performing to specification
  – Typically measured as a percent, for example 99.99%
  – Excludes planned outages
• Performance
  – Time to respond to request or forward packet
  – Megabits or Packets Per Second
  – Discards, Errors, Loss
• Reachability
  – Applies to Hosts & Services
  – Percent of time host or service is reachable
  – Typically measured as a percent, for example 99.99%
  – An unreachable host may not be unavailable to everyone
  – It may still be available from another location
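The availability definition above can be turned into simple arithmetic. The following is a minimal sketch, assuming outages are tracked as durations; the function name and the sample figures are hypothetical and only illustrate how planned outages drop out of the measurement window.

```python
from datetime import timedelta


def availability_percent(period, unplanned_outage, planned_outage=timedelta(0)):
    """Percent of scheduled time the host or service met its specification."""
    # Planned outages are excluded, so they shrink the measurement window.
    scheduled = period - planned_outage
    if scheduled <= timedelta(0):
        raise ValueError("no scheduled service time in this period")
    return 100.0 * ((scheduled - unplanned_outage) / scheduled)


# Hypothetical month: 30 days, a 4-hour maintenance window,
# and 13 minutes of unplanned downtime.
pct = availability_percent(
    timedelta(days=30),
    unplanned_outage=timedelta(minutes=13),
    planned_outage=timedelta(hours=4),
)
print(f"{pct:.3f}%")
```

Reachability could be computed the same way, with "reachable from the monitoring point" replacing "performing to specification".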






Why Monitor Operational Data
• Know about Problems Before your Customers Call
• Prove Hosts & Services are Delivering on SLAs
• Continue to Meet SLAs as your Network Grows






Common NMM Tools






Common Back-end Tools
• Data storage
  – Config files, formats and locations
  – Databases: SQL, key-value, NoSQL
• RRDTool
  – Explain the idea of a round-robin database
• Check_mk
  – Explain the idea of service checking
• Nagios Plugins
  – Explain what Nagios is and what plugins are
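The round-robin database idea can be sketched in a few lines: storage has a fixed number of slots, and once it is full each new sample overwrites the oldest one, so the file never grows. This is an illustration of the concept only, not RRDTool's file format or API; the class and method names are hypothetical.

```python
from collections import deque


class RoundRobinArchive:
    """Fixed-size store: once full, each new sample replaces the oldest."""

    def __init__(self, slots):
        # deque(maxlen=...) silently drops the oldest entry when full.
        self.samples = deque(maxlen=slots)

    def update(self, timestamp, value):
        self.samples.append((timestamp, value))

    def average(self):
        values = [value for _, value in self.samples]
        return sum(values) / len(values) if values else None


# Eight one-minute samples written into a five-slot archive.
rra = RoundRobinArchive(slots=5)
for minute in range(8):
    rra.update(minute * 60, float(minute))

print(list(rra.samples))  # only the five most recent samples remain
print(rra.average())
```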



Network Automation
• A continuous process of generation and deployment of configuration changes, management, and operations of network devices (from Network Automation at Scale)






Network Automation
• Automating config management
• Including config changes based on operational data
• Orchestrated with tools like Ansible, Chef, Puppet, and Salt
• This is the next step in network monitoring and management






Module 2



ADDRESS MANAGEMENT






Address Management
• Planning and managing the assignment and use of IP addresses and closely related resources of a computer network
• IP Address Management (IPAM) tools
  – Racktables
  – Netbox
  – A lot of others (commercial and open source)
https://en.wikipedia.org/wiki/IP_address_management






Tools - Racktables
• Asset management tool
https://www.racktables.org/demo.php






Tools - Netbox
• Open source web application designed to help manage and document computer networks
https://netbox.readthedocs.io/en/stable/



Module 3



CONFIG MANAGEMENT






Network Device Configuration
• How to configure device?
  – Using the command line (Cisco)
  – From a special tool (Mikrotik)
  – From a web interface (Procurve)
  – JSON files (Arista)
  – XML files (Juniper)
• Who configures the device?
• How often do changes happen?



Why do you need to manage config?
• Know when changes are made
• Restore config during failure
• Roll back changes that have an unexpected outcome
• Track config changes over time (history)






What is Version Control?
• Also known as revision control or source control
• Manages changes to files or documents with a revision number
• Allows users to find and highlight changes
• Allows users to restore previous versions of a file or document






What’s a Diff?
• A comparison of two versions of a single file or document
• Highlighting the changes between the two versions
• Allowing users to quickly see only what’s changed
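A diff can be produced with Python's standard `difflib` module. The sketch below compares two versions of a small, made-up device configuration; the hostnames, interface names, and revision labels are hypothetical.

```python
import difflib

# Two hypothetical revisions of the same device configuration.
old_config = [
    "hostname dc1-gw1",
    "interface Gi0/0",
    " description uplink",
    " no shutdown",
]
new_config = [
    "hostname dc1-gw1",
    "interface Gi0/0",
    " description uplink to core",
    " no shutdown",
]

# unified_diff yields header lines, hunk markers, and +/- change lines.
diff_lines = list(
    difflib.unified_diff(
        old_config,
        new_config,
        fromfile="dc1-gw1 (revision 1)",
        tofile="dc1-gw1 (revision 2)",
        lineterm="",
    )
)
print("\n".join(diff_lines))
```

Lines starting with `-` exist only in the old version, lines starting with `+` only in the new one, and unmarked lines are unchanged context.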






What’s a Diff?






Config Management Tools
• Retrieve configuration files
• Allow for their storage as files or in a version control system
• Solve many problems with network operations






Tools - Rancid
• Really Awesome New Cisco config differ
• Monitors a router's (or more generally a device's) configuration
• Uses CVS, Subversion, or Git to maintain history
• Supports Cisco, Foundry, HP, Juniper, and more
• Runs on BSD, Linux, Mac OS
• Pros:
  – The de-facto industry standard for config management
https://www.shrubbery.net/rancid/



Rancid Example
Index: configs/dc1-gw1
===================================================================
retrieving revision 1.677
diff -U 4 -r1.677 dc1-gw1
@@ -713,8 +713,10 @@
  remark permit eduroam to beta-login
  permit tcp any host 204.111.222.3 eq www 443
  remark permit eduroam to stats
  permit tcp any host 204.111.222.4 eq www 443
+ remark permit eduroam to net-api
+ permit tcp any host 204.111.222.5 eq www 443
  remark temp deny access to all
  deny ip any 204.111.222.0 0.0.0.64






Rancid Example
Index: configs/dc1-gw
===================================================================
retrieving revision 1.2213
diff -U 4 -r1.2213 dc1-gw
@@ -32,9 +32,8 @@
  !
  !Flash: bootflash: Directory of bootflash:/
  !Flash: bootflash:     11  drwx      16384  Jan 11 2017 12:13:18 +10:00  lost+found
  !Flash: bootflash:     12  -rw-  371180156  Oct 5 2018 14:05:16 +10:00  asr1000rp1-adventerprisek9.03.13.10.S.154-3.S10-ext.bin
- !Flash: bootflash:     13  -rw-          4  Jul 9 2019 15:15:03 +10:00  .issu_loc_lock
  !Flash: bootflash:  48769  drwx       4096  Jan 11 2017 12:16:08 +10:00  .installer
  !Flash: bootflash: 438913  drwx       4096  Jan 11 2017 13:05:11 +10:00  core
  !Flash: bootflash: 829057  drwx       4096  Oct 11 2018 07:24:32 +10:00  .prst_sync
  !Flash: bootflash: 520193  drwx       4096  Jan 11 2017 12:19:19 +10:00  .rollback_timer






Tools - Oxidized
• Network device configuration backup tool (to replace Rancid)
• Stores files in a version control system
• Supports a large number of manufacturers
  – Cisco (CatOS, IOS, IOSXR, NXOS)
  – Juniper (JunOS, ScreenOS)
  – Huawei (VRP, SmartAX)
  – Mikrotik (RouterOS)
• Pros:
  – Integrates with LibreNMS
https://github.com/ytti/oxidized






Other Tools
• Fetchconfig
• Jazigo






Module 4



DEVICE MONITORING






Intro to SNMP
• Simple Network Management Protocol
• Used to communicate management information between the network management stations and the agents in the network elements
• Even though SNMP is a protocol, we use the term SNMP to describe the complete architecture of the management system



Intro to SNMP
• Network management stations execute management applications which monitor and control network elements
• Network elements are devices such as hosts, gateways, terminal servers
• The agent is a piece of software that runs on the network devices you are managing. It can be a separate program, or it can be incorporated into the operating system. Agents listen and respond on UDP port 161.






SNMP Polling, Traps and MIB
• SNMP Polling is the act of querying an agent for some piece of information. SNMP managers use UDP to poll agents.
• A trap is a way for the agent to tell the NMS that something has happened. Traps are sent asynchronously, not in response to queries from the NMS. SNMP traps are sent using UDP port 162.
• MIB or Management Information Base is a database of managed objects that the agent tracks. Any sort of status or statistical information accessed by the NMS is defined in a MIB.
  – OID or object identifier is the name of a managed object. OIDs are globally unique.
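To make OIDs and polling concrete, the sketch below maps a few well-known MIB-II OIDs to names and turns two successive polls of an interface octet counter into a traffic rate. It does not speak SNMP itself; the polled values are made up, the helper names are hypothetical, and a 32-bit counter that wraps at most once between polls is assumed.

```python
COUNTER32_MODULUS = 2 ** 32

# A few well-known objects from the standard MIB-II tree.
MIB = {
    (1, 3, 6, 1, 2, 1, 1, 3): "sysUpTime",
    (1, 3, 6, 1, 2, 1, 2, 2, 1, 10): "ifInOctets",
    (1, 3, 6, 1, 2, 1, 2, 2, 1, 16): "ifOutOctets",
}


def parse_oid(text):
    """Turn dotted notation such as '1.3.6.1.2.1.1.3.0' into a tuple."""
    return tuple(int(part) for part in text.strip(".").split("."))


def describe(oid_text):
    """Return (object name, instance suffix) using the longest MIB prefix."""
    oid = parse_oid(oid_text)
    for length in range(len(oid), 0, -1):
        name = MIB.get(oid[:length])
        if name:
            return name, oid[length:]
    return None, oid


def bits_per_second(prev_octets, curr_octets, interval_seconds):
    """Rate between two polls of a 32-bit octet counter, allowing one wrap."""
    delta = (curr_octets - prev_octets) % COUNTER32_MODULUS
    return delta * 8 / interval_seconds


# Hypothetical polls of ifInOctets for interface index 3, ten seconds apart.
print(describe("1.3.6.1.2.1.2.2.1.10.3"))
print(bits_per_second(4294967000, 704, 10))
```

The modulo step is what keeps the rate sensible when the counter rolls over between two polls.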



SNMP Applications
• LibreNMS
• MRTG
• PRTG
• …






Beyond SNMP
• SNMP is a heavy-weight protocol with low information density
• SNMP was not designed for streaming high resolution data
• It’s seen as too slow, incomplete, network-specific, and hard to operationalize
New protocols are being developed to stream telemetry data in real-time
• YANG data model
• XML, JSON and GPB (Google Protocol Buffers) encoding
• Data pushed from agents, not requested from managers
• UDP, TCP or gRPC transport available



Tools - LibreNMS
• An open-source network monitoring system (NMS)
• Capable of managing small or big networks
• Most management functions are supported or can be integrated
• Details under the hood:
  – Written in PHP, derived from the Observium project
  – Configuration in MySQL
  – Operational data is stored in Round Robin Database files
https://www.librenms.org/






LibreNMS Dashboard






Tools – Sensu
• Sensu is a multi-cloud monitoring system that allows for automating monitoring workflow
  – Monitor containers, instances, applications, and on-premises infrastructure
  – Integrates with PagerDuty, Slack, Grafana, etc.
• Sensu Go is the latest version
• Uchiwa is an open-source dashboard for the Sensu monitoring framework
https://sensu.io/about/






Sensu / Uchiwa Dashboard



https://github.com/sensu/uchiwa



Tools - Grafana
• Open platform for monitoring and analytics
• Does time series analytics
• Plugins to integrate with other applications






Grafana Dashboard



https://grafana.com/



Module 5



FLOW MONITORING






What is a Flow?
• A flow is defined as a unidirectional sequence of packets with some common properties that pass through a network device. (RFC 3954)






Why do we monitor IP flows?
• Where is our traffic coming from?
• What kind of application traffic is it?
• Are the correct QoS bits set?
• Have routing changes impacted the network?






What’s Netflow?
• Cisco protocol for flow monitoring released in 1996
• Described by RFC 3954, but not an Internet Standard
• Netflow V5 is supported by nearly all router platforms
• Versions:
  – Version 5: IPv4 only
  – Version 9: IPv4/v6 and MPLS






What is IPFIX?
• IP Flow Information Export
• Vendor neutral protocol for flow monitoring
• Started through the IETF process in 2004 & released in 2011
• Based on Cisco’s Netflow version 9
• IPFIX is an Internet Standard replacement for version 9






How do Netflow and IPFIX work?
• Packets with matching tuples are grouped into a flow
• First occurrence of a flow is recorded in a flow cache
• Cache entries are timestamped
• Number of packets and bytes matching the flow are tallied
• Details like next hop IP, ASN, subnet masks, and TCP flags can be recorded
• Cache can be queried interactively, or flows can be exported
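The flow cache described above can be sketched as a dictionary keyed by the matching tuple. This is a simplified illustration rather than any router's implementation: it assumes a 5-tuple key (source, destination, protocol, source port, destination port), the packet list is made up, and the names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FlowRecord:
    first_seen: float  # timestamp of the first packet in the flow
    last_seen: float   # timestamp of the most recent packet
    packets: int = 0
    octets: int = 0


def build_flow_cache(packets):
    """Group packets by 5-tuple and tally packets and bytes per flow."""
    cache = {}
    for ts, src, dst, proto, sport, dport, size in packets:
        key = (src, dst, proto, sport, dport)
        record = cache.get(key)
        if record is None:
            # First occurrence of this tuple: create a timestamped entry.
            record = cache[key] = FlowRecord(first_seen=ts, last_seen=ts)
        record.last_seen = ts
        record.packets += 1
        record.octets += size
    return cache


# Hypothetical packets: (time, src, dst, protocol, src port, dst port, bytes)
observed = [
    (0.00, "192.0.2.10", "198.51.100.5", 6, 51000, 443, 60),
    (0.02, "198.51.100.5", "192.0.2.10", 6, 443, 51000, 60),
    (0.05, "192.0.2.10", "198.51.100.5", 6, 51000, 443, 1500),
    (0.90, "192.0.2.10", "198.51.100.5", 6, 51000, 443, 1500),
]
cache = build_flow_cache(observed)
for key, record in cache.items():
    print(key, record)
```

Because flows are unidirectional, the reply packet in the sample creates a second cache entry instead of being counted with the request.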



Setting up Netflow & IPFIX
• Cisco – Netflow Configuration
• Juniper – Monitoring, Sampling …
• Huawei – Netstream Configuration
• Mikrotik – IP Traffic Flow






Flow Sampling / Downsampling
• Tracking every flow can take a lot of device resources
• Some routers & switches can be crippled by turning on Netflow
• Sampling helps by tracking one in n packets
• CPU load can be significantly reduced – but so can resolution
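The trade-off between load and resolution can be illustrated with a toy 1-in-n sampler. This sketch assumes deterministic sampling (every n-th packet) and scales the sampled totals by n to estimate the real ones; the packet sizes are made up and the function names are hypothetical.

```python
def sample_one_in_n(packet_sizes, n):
    """Keep every n-th packet (deterministic 1-in-n sampling)."""
    return packet_sizes[n - 1::n]


def estimate_totals(sampled_sizes, n):
    """Scale sampled counts by n to estimate total packets and bytes."""
    return len(sampled_sizes) * n, sum(sampled_sizes) * n


# Hypothetical traffic: a repeating pattern of 10 packet sizes, 1000 packets.
sizes = [1500, 64, 64, 1500, 576, 64, 1500, 1500, 64, 576] * 100

sampled = sample_one_in_n(sizes, 10)
est_packets, est_octets = estimate_totals(sampled, 10)

print("examined:", len(sampled), "of", len(sizes), "packets")
print("estimated packets:", est_packets, "actual:", len(sizes))
print("estimated bytes:  ", est_octets, "actual:", sum(sizes))
```

Only a tenth of the packets are examined and the packet count is still estimated correctly, but the byte estimate is off because the sampler happens to line up with the traffic pattern. That is the loss of resolution in practice, and one reason random sampling is often preferred over strictly periodic sampling.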






Tools - Softflowd
• Software Flow Monitoring
• Passive Netflow collector
• Network traffic passing through a switch can be mirrored
• Attach a Unix computer to the mirrored port
• Softflowd tracks flows from the mirrored traffic
• Flows can be exported just as they are from routers & switches






Ad-Hoc Flow Queries
• Cisco: show ip flow
• JunOS: show services accounting flow-detail






Tools – nfdump + nfsen
• Nfdump collects and processes netflow and sflow
  – C application that receives flows & logs them to files
• Nfsen generates stats and displays graphs
  – Web-based front-end to Nfdump
https://github.com/phaag/nfdump
http://nfsen.sourceforge.net/






Tools – nfdump + nfsen






Tools - ntopng
• Web-based traffic and security network monitoring tool
https://github.com/ntop/ntopng






Module 6



LOG MANAGEMENT






What generates logs?
• Operating Systems
  – Linux, Mac, Windows
• System applications
  – Cron, init, RDBMS
• Network applications
  – BGP, DHCP, HTTP, iptables …






What do servers log?
• Backups
• Connections
• Database messages
• Hardware messages
• Software versions and updates






What do Network Apps log?
• Connections
• DHCP details
• Hardware messages
• Port events
• Protocol information






Where are logs stored?
• Linux/Mac: /var/log
• Windows: Event Viewer
• Network devices: Memory
Is it useful to have logs stored all over the place?
What happens to events written to memory when devices are turned off?






Firewall Log






Syslog Message Levels

Level  Description
0      Emerg
1      Alert
2      Critical
3      Error
4      Warning
5      Notice
6      Info
7      Debug
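On the wire, a syslog message carries its level inside the priority value at the start of the message, computed as facility × 8 + level. The sketch below decodes that value and compares levels, using the level names from the table above; the function names are hypothetical.

```python
# Index in this list equals the numeric syslog level (0 is most urgent).
SEVERITIES = [
    "Emerg", "Alert", "Critical", "Error",
    "Warning", "Notice", "Info", "Debug",
]


def decode_pri(pri):
    """Split a syslog priority value into (facility number, level name)."""
    facility, severity = divmod(pri, 8)
    return facility, SEVERITIES[severity]


def at_least(level_name, threshold_name):
    """True when level_name is as urgent as, or more urgent than, threshold."""
    return SEVERITIES.index(level_name) <= SEVERITIES.index(threshold_name)


# <165> is facility 20 (local4) at level 5 (Notice): 20 * 8 + 5 = 165.
print(decode_pri(165))
print(at_least("Critical", "Warning"))
print(at_least("Debug", "Warning"))
```

Lower numbers are more urgent, so a filter for "Warning and above" keeps levels 0 to 4.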






Syslog aggregation






How to aggregate syslog
• Set up a remote syslog facility on a server
  – Graylog
  – Elastic Stack
  – Rsyslog
  – Splunk
  – Syslog-ng
• Configure devices to send their logs
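The sending side can be sketched with Python's standard `logging.handlers.SysLogHandler`, which forwards log records over UDP by default. In this sketch a local UDP socket stands in for the central syslog server so the example is self-contained; the logger name and the message are hypothetical, and a real deployment would point the handler at the aggregation server's address instead.

```python
import logging
import logging.handlers
import socket


def send_and_capture(message):
    """Send one log record over UDP syslog and return the raw datagram."""
    # Stand-in for the central syslog server: localhost, OS-chosen port.
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    server.settimeout(2.0)

    # The "device" side: a logger that forwards records to that address.
    handler = logging.handlers.SysLogHandler(address=server.getsockname())
    logger = logging.getLogger("demo-device")
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    try:
        logger.warning(message)
        datagram, _ = server.recvfrom(4096)
    finally:
        logger.removeHandler(handler)
        handler.close()
        server.close()
    return datagram


if __name__ == "__main__":
    print(send_and_capture("link down on ge-0/0/1"))
```

The datagram starts with the priority value in angle brackets. With the handler's default facility (user, 1) and a warning (level 4) that is `<12>`.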






Tools - Graylog
• Commercial + Open source software
• Collection, Storage, Analysis, & Visualisation
• Tightly coupled software stack including:
  – ElasticSearch for log storage and search
  – MongoDB for configuration and metadata
• LibreNMS integration






Tools – Elastic Stack
• Open source with commercial support available
• Collection via Logstash
• ElasticSearch for Storage and Search
• Kibana for Search, Analytics, and Visualisation
• Also known as the ELK stack






Tools - Rsyslog
• Open source with commercial support available
• Transports: TCP, SSL, TLS, RELP
• Can log to MySQL, PostgreSQL, Oracle and more
• Filter on any part of a syslog message
• Multi-threaded and suitable for relay chains






Tools - Splunk
• Commercial software
• Free for small users at < 500 MB/day
• Collection, Storage, Analysis & Visualisation
• Real-time alerting engine included
• Popular corporate solution with 13k customers






Tools – Syslog-ng
• Free and open source with commercial support available
• Collection and storage
• Adds TCP and TLS to basic UDP transport
• Can extract structured information from log messages
• Can log directly to a database
• Requires external tools for Analysis and Visualisation






Log Alerting & Analysis
• No systems administrator has time to read all logs
• Log messages are unimportant until they aren’t
  – Post-incident security reports
  – Billing inquiries
  – Law Enforcement Agency request
• Some platforms include analysis or alerting
• Others need external tools like Tenshi or Swatch
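The idea behind pattern-based alerting tools can be sketched as a list of regular expressions, each with a label, applied to every incoming line. This is a simplified illustration and not the configuration syntax of Tenshi or Swatch; the rules and the sample log lines are hypothetical.

```python
import re

# Hypothetical rules: (pattern, alert label). The first matching rule wins.
RULES = [
    (re.compile(r"Failed password for (invalid user )?\S+ from \S+"),
     "ssh-auth-failure"),
    (re.compile(r"%LINK-3-UPDOWN: Interface \S+, changed state to down"),
     "link-down"),
]


def scan(lines):
    """Return (label, line) for every line that matches an alert rule."""
    alerts = []
    for line in lines:
        for pattern, label in RULES:
            if pattern.search(line):
                alerts.append((label, line))
                break
    return alerts


# Made-up log lines for illustration.
sample = [
    "Jul  9 10:01:02 srv1 sshd[812]: Failed password for root from 203.0.113.9 port 4242 ssh2",
    "Jul  9 10:01:05 srv1 CRON[900]: (root) CMD (run-parts /etc/cron.hourly)",
    "Jul  9 10:02:11 gw1 %LINK-3-UPDOWN: Interface Gi0/1, changed state to down",
]
alerts = scan(sample)
for label, line in alerts:
    print(label, "->", line)
```

Lines that match no rule are ignored, which is how such tools keep routine messages out of an administrator's inbox.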






Beyond Alerting: Analysis
• The volume of log entries is as important as the entries themselves
  – What’s your baseline number of entries?
  – Has it changed?
  – Do more log entries mean an attack?
• Similar log entries across a network can be important
  – Port scanning, intrusion attempts
• Similar log entries across time can be important
  – Is someone attacking you very slowly?
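A baseline check on log volume can be sketched with simple statistics. The example below assumes hourly entry counts are already available, treats the first few hours as the baseline, and flags later hours that deviate by more than three standard deviations. The counts, the threshold, and the function name are all hypothetical choices for illustration.

```python
from statistics import mean, stdev


def unusual_intervals(counts, baseline_size, threshold=3.0):
    """Flag intervals whose count is far from the baseline average."""
    baseline = counts[:baseline_size]
    mu, sigma = mean(baseline), stdev(baseline)
    return [
        (index, count)
        for index, count in enumerate(counts[baseline_size:], start=baseline_size)
        if abs(count - mu) > threshold * sigma
    ]


# Hypothetical log entries per hour; the first eight hours form the baseline.
hourly_counts = [120, 115, 130, 125, 118, 122, 127, 121, 119, 480, 124]
flagged = unusual_intervals(hourly_counts, baseline_size=8)
print(flagged)
```

A spike is not proof of an attack, but it tells the operator which hour of logs is worth reading first. A sudden drop in volume would be flagged too, which can point to a device that has stopped logging.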


