Network Monitoring with NetFlow – Moving Up To The Next Level

No network administrator argues with the need to understand his network traffic.

Understanding how the network is being used is the key to meeting QoS norms. It is essential to know what is really happening on the network and who is using what percentage of the available bandwidth.

When SNMP first appeared in 1988, it was a network management tool that gave network managers a first level of visibility. Over the years the protocol has been improved, and key shortcomings such as its lack of security have been rectified in later versions. However, network usage has changed far more over the last twenty years than SNMP has. Nowadays, a typical network supports far more than simple data packets and a few applications. Once we add real-time applications, VoIP, videoconferencing, multicast and unicast media streaming, VPNs, cloud computing, e-commerce, spam, and trojan- and virus-induced traffic, it becomes clear that a next-generation tool is required to understand data flows and manage networks.

Being aware of link usage isn’t enough anymore. Administrators need to know the composition of the traffic, not only its total volume, to be sure whether it is expected traffic or an unexpected or undesired use of the network.

This next generation tool is NetFlow (and its variations such as NetStream, sFlow, IPFIX, etc).

NetFlow is a technology, integral to Cisco IOS, that looks at packets as parts of a flow rather than merely counting them. A flow, as the name signifies, has a beginning, a body and an end. When data packets are grouped as flows, administrators are able to understand the applications that are using the network more comprehensively. This naturally allows for better management and an improved quality of service. Administrators are also able to recognize problem areas better and take pre-emptive action.

NetFlow has become a de facto standard and has been adopted by many manufacturers besides Cisco. Juniper, Extreme, Vanguard, Huawei and several other vendors incorporate similar functionality in their routers and switches.

Network Awareness with NetFlow

NetFlow looks at IP flows rather than counting bytes at interfaces. A flow is a stream of IP packets that share the following seven fields:

  • A common source IP address
  • A common destination IP address
  • A common source port
  • A common destination port
  • Same layer 3 protocol
  • Same type of service
  • The same logical interface

Note: A flow is unidirectional, so each session produces two flows. When a web browser fetches an image from a web server, for example, the request travels in one flow and the response in another.
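
As a rough illustration of the seven-field key and the two-flows-per-session rule, the following Python sketch groups packets by key and counts packets and bytes per flow. The FlowKey type, the group_into_flows function and the (key, size) input shape are hypothetical names chosen for this example; they are not part of IOS or of any NetFlow library.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowKey(NamedTuple):
    """The seven fields that identify a flow (names are illustrative)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int   # IP protocol number, e.g. 6 = TCP, 17 = UDP
    tos: int        # type-of-service byte
    input_if: int   # logical interface the packet arrived on


def group_into_flows(packets):
    """Accumulate packet and byte counts per flow key.

    `packets` is an iterable of (FlowKey, size_in_bytes) pairs; this input
    shape is an assumption made for the example.
    """
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for key, size in packets:
        flows[key]["packets"] += 1
        flows[key]["bytes"] += size
    return dict(flows)


# A browser (10.0.0.5) fetching an image from a web server (192.0.2.10).
request = FlowKey("10.0.0.5", "192.0.2.10", 51000, 80, 6, 0, 1)
response = FlowKey("192.0.2.10", "10.0.0.5", 80, 51000, 6, 0, 2)
packets = [(request, 400), (response, 1500), (response, 1500), (request, 40)]

flows = group_into_flows(packets)
for key, counters in flows.items():
    print(key.src_ip, "->", key.dst_ip, counters)
```

The single browser session shows up as two entries, one per direction, because the source and destination fields are swapped in the reply.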

Based on these parameters, the IOS is able to identify a flow. This information is stored in the NetFlow cache and can be transferred to a collector for detailed analysis. A powerful NetFlow analysis implementation normally relies on an external collector equipped with sophisticated statistical tools to supply network management with really useful information. A router by itself can only show the raw data held in the flow cache, and the size of that cache is limited by router memory.

Each flow has a lifetime. By default, a flow that has ceased to be active is expired after 15 seconds of inactivity. A flow that is still continuing is purged once its information has been in the cache for 30 minutes, even if the traffic still continues. For connection-oriented flows such as FTP or Telnet, the data is purged as soon as the session is closed.
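
The three expiry rules can be sketched as a single decision function. This is a minimal illustration under stated assumptions, not the router's actual implementation: the function name, the argument names and the use of plain seconds for timestamps are all hypothetical.

```python
INACTIVE_TIMEOUT = 15        # seconds without a packet (value from the text)
ACTIVE_TIMEOUT = 30 * 60     # seconds since the flow entered the cache


def should_expire(first_seen, last_seen, now, session_closed=False):
    """Return the reason a cache entry should be expired, or None to keep it.

    All times are seconds on the same clock. `session_closed` stands for the
    end of a connection-oriented session.
    """
    if session_closed:
        return "session closed"
    if now - last_seen >= INACTIVE_TIMEOUT:
        return "inactive"
    if now - first_seen >= ACTIVE_TIMEOUT:
        return "active timeout"
    return None


print(should_expire(first_seen=0, last_seen=10, now=20))      # kept
print(should_expire(first_seen=0, last_seen=10, now=25))      # idle too long
print(should_expire(first_seen=0, last_seen=1795, now=1800))  # long-lived flow
```

A long-lived flow that hits the active timeout simply starts a new cache entry with its next packet, so the collector sees it as a series of records and not as one very late one.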

Active flows can be analyzed by displaying the cache. This allows a far deeper understanding of the activity on the network even without exporting and analyzing the cache contents. If a collector device has been specified by the user, flow records are exported using UDP packets. In the widely used version 5 export format, each UDP packet consists of one flow header and up to thirty flow records.
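
To make the header-plus-records layout concrete, here is a hedged Python sketch that decodes one export datagram. It assumes the version 5 export format (a 24-byte header followed by 48-byte records), which the text above does not name explicitly. The parse_v5 function and the dictionary keys are hypothetical, and the sample datagram is built by hand and not captured from a router.

```python
import socket
import struct

# Assumed layout: NetFlow version 5 export format.
HEADER = struct.Struct("!HHIIIIBBH")                  # 24 bytes
RECORD = struct.Struct("!4s4s4sHHIIIIHHxBBBHHBB2x")   # 48 bytes


def parse_v5(datagram):
    """Decode one export datagram into a list of flow-record dicts."""
    version, count = HEADER.unpack_from(datagram, 0)[:2]
    if version != 5:
        raise ValueError(f"expected NetFlow v5, got version {version}")
    records = []
    for i in range(count):
        f = RECORD.unpack_from(datagram, HEADER.size + i * RECORD.size)
        records.append({
            "src_ip": socket.inet_ntoa(f[0]),
            "dst_ip": socket.inet_ntoa(f[1]),
            "input_if": f[3],
            "output_if": f[4],
            "packets": f[5],
            "bytes": f[6],
            "src_port": f[9],
            "dst_port": f[10],
            "tcp_flags": f[11],
            "protocol": f[12],
            "tos": f[13],
        })
    return records


# Hand-built sample: one header announcing a single record.
sample = HEADER.pack(5, 1, 0, 0, 0, 0, 0, 0, 0) + RECORD.pack(
    socket.inet_aton("10.0.0.5"), socket.inet_aton("192.0.2.10"),
    socket.inet_aton("0.0.0.0"), 1, 2, 10, 4200, 0, 0,
    51000, 80, 0x18, 6, 0, 0, 0, 0, 0)

parsed = parse_v5(sample)
print(parsed[0])
```

In a real collector the datagram would come from a UDP socket bound to the port configured on the exporting router; that part is omitted here.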

At the collecting station, a flow analyzer is required to process the exported flow data in real time. Flow analyzers can be either open source or commercial software / hardware systems.

Key Capabilities

Imaginative administrators can use NetFlow in many different ways to get valuable insights into their network. A few key uses are listed below and discussed in subsequent paragraphs.

  • Bandwidth Monitoring and Traffic Analysis
  • Network Forensics and Security Management
  • Application Monitoring
  • Tracking Application Migration
  • Validating QoS
  • Capacity Planning
  • Identifying worms and malware
  • Analysis of VPN traffic and Teleworker behavior
  • Calculating total cost of ownership for applications

Bandwidth Monitoring and Traffic Analysis with NetFlow

With properly analyzed NetFlow data, a network administrator can really understand what his traffic comprises. It is easy to find out which application or user is consuming large amounts of bandwidth, which protocols are most popular, and what the ingress and egress bandwidth utilization is.

One can easily filter or aggregate the data in the NetFlow collector and generate very specific reports. Such operations can be based on a combination of the following criteria:

  • Source or destination IP addresses
  • Incoming or outgoing interfaces
  • Source or destination ports
  • Devices that the traffic has crossed
  • Source or destination autonomous systems
  • Protocol used
  • Flags used during conversation
  • Type of service

The output of the filter can be further restricted to between specified dates and timings. This allows an extremely focused view of the network utilization and has enormous applications – including planning application use, forensics, incident monitoring and so on.
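
A collector-side report of this kind can be sketched in a few lines of Python. The example below filters flow records by time window, protocol and destination port, then aggregates bytes per source address. The record layout (plain dictionaries with a "time" field) and the top_talkers function are assumptions made for illustration; real collectors expose their own query interfaces.

```python
from collections import Counter
from datetime import datetime


def top_talkers(records, start, end, protocol=None, dst_port=None, n=3):
    """Sum bytes per source IP for records that match the filter."""
    totals = Counter()
    for r in records:
        if not (start <= r["time"] < end):
            continue
        if protocol is not None and r["protocol"] != protocol:
            continue
        if dst_port is not None and r["dst_port"] != dst_port:
            continue
        totals[r["src_ip"]] += r["bytes"]
    return totals.most_common(n)


# Hypothetical flow records, as a collector might hold them after decoding.
records = [
    {"time": datetime(2024, 1, 8, 9, 5), "src_ip": "10.0.0.5",
     "dst_port": 80, "protocol": 6, "bytes": 5000},
    {"time": datetime(2024, 1, 8, 9, 20), "src_ip": "10.0.0.7",
     "dst_port": 80, "protocol": 6, "bytes": 9000},
    {"time": datetime(2024, 1, 8, 9, 40), "src_ip": "10.0.0.5",
     "dst_port": 80, "protocol": 6, "bytes": 1000},
    {"time": datetime(2024, 1, 8, 9, 45), "src_ip": "10.0.0.9",
     "dst_port": 53, "protocol": 17, "bytes": 300},
    {"time": datetime(2024, 1, 8, 11, 0), "src_ip": "10.0.0.5",
     "dst_port": 80, "protocol": 6, "bytes": 70000},
]

window_start = datetime(2024, 1, 8, 9, 0)
window_end = datetime(2024, 1, 8, 10, 0)
web = top_talkers(records, window_start, window_end, protocol=6, dst_port=80)
print(web)
```

The large transfer at 11:00 falls outside the one-hour window and is ignored, which is exactly the effect of restricting a report to specified dates and times.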

Network Forensics and Security Management

Network security is a desired end state comprising personnel and policies, intrusion detection, firewalls, and network behavior analysis. The first three are reasonably well met by framing policy and using technology appropriately. Network behavior and anomaly detection is often missed out. This can make the network vulnerable to threats that have not yet been detected and classified.

If your network is critical to the company, proactive anomaly detection is essential. NetFlow provides the data needed to do that.

Although it may seem strange, most networks in operation, even in large and well-organized companies, carry many unknown applications that send data and consume bandwidth. Some of them are really important to the company's business and yet completely unknown to the network administrators. This is a consequence of an IT culture that pays too little attention to the post-deployment traffic requirements of new applications.

Application Monitoring

Application monitoring capability is critical when most services are being delivered over the network. The advanced capability of NetFlow allows administrators to see bandwidth consumption based on any combination of ports, addresses, protocols, devices, interfaces, or types of service.

You can also group applications and map them to users, IP ranges or any other available parameter to build different traffic matrices and understand exactly what traffic is crossing the network, from which source to which destination.

All of this allows detailed knowledge about the usage of the resources that are being provided. Roadblocks become clear and applications that need further support are easily identified. Traffic to specific sites can be monitored by providing the port, protocol and IP address.

When managing mission-critical applications such as ERP applications, videoconferencing, web-based applications, instant messaging, SMTP and others, it is important to ensure that applications are getting the bandwidth they need to function efficiently. Left to themselves, users would only be able to complain that pages take an eternity to open, forms do not appear or data input does not work. However, with NetFlow, it is a simple matter to make out whether the bottleneck is at the database, the web servers or the network. It is also important to understand that bottlenecks can change or additional bottlenecks can emerge. Therefore, application management is an ongoing process. An existing database server may be adequate at one level of utilization, yet become a bottleneck at a higher level of use.

Without this technology, a great deal of trial and error (and possibly unnecessary expenditure) would be required before satisfactory performance could be achieved, and network managers would face a high risk of failure along the way. In the meantime, there could be losses because users have to stop working. In the worst case, the network manager is unable to explain or justify network costs clearly to the company's senior management.

With NetFlow, guesswork based on trial and error is largely eliminated.

Monitoring Application Thresholds Instead of Link Thresholds

In good NetFlow collectors, administrators can set thresholds for bandwidth utilization based on the traffic of a single application instead of total link usage. This helps ensure that no application is starved of bandwidth and that bandwidth hogs are corrected in time. This leads to a more reliable operating environment for all applications.
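
A per-application threshold check might look like the following sketch. It assumes applications can be recognized by destination port and that the records cover one fixed measurement interval. The function name, the port-to-application mapping and the threshold values are all hypothetical.

```python
def apps_over_threshold(records, interval_seconds, app_ports, thresholds_bps):
    """Return {application: rate_in_bps} for applications above their limit."""
    bytes_per_app = {}
    for r in records:
        app = app_ports.get(r["dst_port"])
        if app is not None:
            bytes_per_app[app] = bytes_per_app.get(app, 0) + r["bytes"]

    over = {}
    for app, total_bytes in bytes_per_app.items():
        rate_bps = total_bytes * 8 / interval_seconds
        # Applications without a configured threshold are never flagged.
        if rate_bps > thresholds_bps.get(app, float("inf")):
            over[app] = rate_bps
    return over


# Hypothetical mapping and limits for a 60-second measurement interval.
app_ports = {443: "web", 1433: "database"}
thresholds_bps = {"web": 2_000_000, "database": 1_000_000}
interval_records = [
    {"dst_port": 443, "bytes": 20_000_000},
    {"dst_port": 443, "bytes": 10_000_000},
    {"dst_port": 1433, "bytes": 3_000_000},
]

alerts = apps_over_threshold(interval_records, 60, app_ports, thresholds_bps)
print(alerts)
```

Here the link as a whole might still look healthy, yet the web traffic alone averages 4 Mbit/s against a 2 Mbit/s limit, which is the situation a link-level threshold would miss.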

Tracking Application Migration

Capturing application migration statistics is a very interesting use of NetFlow. When a company is migrating from one version of a software package to another and both are running concurrently, NetFlow can give an instantaneous display and detailed reports of how many users (and which ones) are still using the old application. This type of data is of great use in migration control meetings and can make user compliance much easier.
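
One simple way to produce such a report, assuming the old and new versions run on different server addresses, is to compare the sets of client addresses seen talking to each. The migration_status function and the addresses below are hypothetical and only illustrate the idea.

```python
def migration_status(records, old_server, new_server):
    """Classify client IPs by which version of the application they contact."""
    old_users = {r["src_ip"] for r in records if r["dst_ip"] == old_server}
    new_users = {r["src_ip"] for r in records if r["dst_ip"] == new_server}
    return {
        "still_on_old": sorted(old_users - new_users),
        "using_both": sorted(old_users & new_users),
        "migrated": sorted(new_users - old_users),
    }


# Hypothetical flow records: 192.0.2.10 hosts the old version, .20 the new.
migration_records = [
    {"src_ip": "10.0.0.5", "dst_ip": "192.0.2.10"},
    {"src_ip": "10.0.0.6", "dst_ip": "192.0.2.10"},
    {"src_ip": "10.0.0.6", "dst_ip": "192.0.2.20"},
    {"src_ip": "10.0.0.7", "dst_ip": "192.0.2.20"},
]

status = migration_status(migration_records, "192.0.2.10", "192.0.2.20")
print(status)
```

If both versions share one server, the same comparison can be made on destination port or any other field that distinguishes them.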

Validating QoS

Achieving the planned QoS is critical in most cases; otherwise the QoS implementation itself is useless. However, in a dynamic scenario, an administrator needs tools that help document the QoS being achieved as well as help in troubleshooting. NetFlow is an ideal tool for this.

Once the traffic is well understood, appropriate QoS statements can be written. In a teleworker environment, voice use is typically greater during daylight hours, whereas more bandwidth is required for backup, database replication and similar housekeeping activities at night. These differing requirements can be clearly seen and understood using NetFlow and catered for.
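
The day and night pattern can be made visible by bucketing flow bytes by application and hour of day. The sketch below assumes each record has already been labelled with an application name; the "app" field, the function names and the sample figures are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def hourly_profile(records):
    """Return {application: {hour_of_day: total_bytes}}."""
    profile = defaultdict(lambda: defaultdict(int))
    for r in records:
        profile[r["app"]][r["time"].hour] += r["bytes"]
    return {app: dict(hours) for app, hours in profile.items()}


def busiest_hour(profile, app):
    """Hour of day in which the application moved the most bytes."""
    hours = profile[app]
    return max(hours, key=hours.get)


# Hypothetical records: voice in working hours, backup at night.
qos_records = [
    {"time": datetime(2024, 1, 8, 10, 15), "app": "voice", "bytes": 4000},
    {"time": datetime(2024, 1, 8, 10, 45), "app": "voice", "bytes": 2000},
    {"time": datetime(2024, 1, 8, 14, 30), "app": "voice", "bytes": 3000},
    {"time": datetime(2024, 1, 8, 2, 0), "app": "backup", "bytes": 900000},
    {"time": datetime(2024, 1, 8, 3, 0), "app": "backup", "bytes": 100000},
]

profile = hourly_profile(qos_records)
print(busiest_hour(profile, "voice"), busiest_hour(profile, "backup"))
```

A profile like this gives the administrator a factual basis for time-based QoS policies.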

NetFlow is also used to monitor QoS on MPLS networks, where the router configuration cannot simply be examined to determine the QoS. In such cases, NetFlow provides a direct measure of the QoS being provided. Presented with conclusive NetFlow data, the correct cause of a bottleneck can be easily determined.

Using NetFlow for Capacity Planning

While many will say that SNMP can do this job just as well, NetFlow is far more granular. It allows you to see who is using the network and how much bandwidth they are using. Even if the average bandwidth utilization is within limits, a critical application that is not getting the bandwidth it needs can be pinpointed, and the data can be used to convince management to add bandwidth. Large companies also use this feature to generate department-based bandwidth utilization figures for internal billing and accounting.

Identifying Worms and Malware

Many Denial of Service attacks can be detected and analyzed with NetFlow, and once an attack is understood, it can be firewalled or handled in other ways. While firewalls and IDS are the primary means of defense against such attacks, NetFlow is a valuable enhancement, and since its data can be stored in a standard database, it can be used to analyze the event in detail later on.

In many other cases, compromised mail servers have been easily identified by selecting SMTP traffic and checking the number of flows. ISPs often issue warnings about such servers and then suspend accounts. This can be avoided by an intelligent use of NetFlow.
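
The SMTP check described above can be sketched as a flow count per source address. The example assumes SMTP is identified as TCP traffic to destination port 25 and uses an arbitrary limit of 100 flows per reporting period. The function names and the limit are hypothetical and would need tuning against normal traffic on a real network.

```python
from collections import Counter

TCP = 6
SMTP_PORT = 25


def smtp_flow_counts(records):
    """Count SMTP flows (TCP to port 25) per source IP."""
    counts = Counter()
    for r in records:
        if r["protocol"] == TCP and r["dst_port"] == SMTP_PORT:
            counts[r["src_ip"]] += 1
    return counts


def suspicious_senders(records, max_flows):
    """Return {src_ip: flow_count} for hosts above the flow limit."""
    counts = smtp_flow_counts(records)
    return {src: n for src, n in counts.items() if n > max_flows}


# Hypothetical records: one host sends a normal amount of mail, one floods.
smtp_records = (
    [{"src_ip": "10.0.0.25", "dst_port": 25, "protocol": 6}] * 3
    + [{"src_ip": "10.0.0.99", "dst_port": 25, "protocol": 6}] * 500
    + [{"src_ip": "10.0.0.99", "dst_port": 80, "protocol": 6}] * 10
)

flagged = suspicious_senders(smtp_records, max_flows=100)
print(flagged)
```

Counting flows instead of bytes matters here, because spam typically shows up as a very large number of small connections.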

Analysis of VPN Traffic and Teleworker Behavior

NetFlow has been very successfully used to discover the actual utilization of VPNs. A careful analysis clearly brings out the network utilization values, and the VPNs can be allocated more (or less) bandwidth as required.

Identifying teleworker traffic is easy with NetFlow because all of it travels over standard routing encapsulation tunnels. Because it is easy to distinguish different types of traffic (voice, SMTP, HTTP and other applications), administrators can create more tightly focused QoS statements. Typical examples are prioritizing voice during the day and allowing more bandwidth for data backup at night.

Calculating Total Cost of Ownership of Applications

Most enterprises calculate TCO before the application is released to a large number of users. WAN costs are an important part of the TCO. A simple method is to test the application in a lab environment and use NetFlow to evaluate the traffic it generates. This procedure gives a very accurate measure of the WAN traffic and helps avoid nasty surprises later on.

Some other specific examples where NetFlow has been used to calculate TCO:

  • Calculating the cost of deploying large numbers of IP security cameras to monitor remote locations.
  • Cisco itself has used NetFlow to calculate the cost benefits of deploying 50,000 IP phones at its offices all over the world as compared to the cost of using PSTN services.
  • Several enterprises have chosen to host instances of their content management system servers at locations close to users to minimize WAN traffic. This was based on a TCO study comparing a single remote location on the WAN versus several servers on LAN segments in different locations.
  • In many cases, NetFlow is used to divide the cost of network access between the various departments of an organization. It is a simple matter to calculate the ratio of traffic to different departments and calculate network access costs.
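
The departmental cost split in the last item amounts to simple proportional arithmetic. The sketch below assumes each department can be identified by a source subnet and that cost is shared in proportion to bytes sent. The allocate_cost function, the subnets and the cost figure are hypothetical.

```python
import ipaddress


def allocate_cost(records, departments, total_cost):
    """Split total_cost between departments in proportion to bytes sent.

    `departments` maps a department name to its subnet in CIDR notation.
    Traffic from addresses outside every listed subnet is ignored.
    """
    nets = {name: ipaddress.ip_network(cidr)
            for name, cidr in departments.items()}
    usage = dict.fromkeys(nets, 0)
    for r in records:
        src = ipaddress.ip_address(r["src_ip"])
        for name, net in nets.items():
            if src in net:
                usage[name] += r["bytes"]
                break
    total = sum(usage.values())
    if total == 0:
        return dict.fromkeys(nets, 0.0)
    return {name: total_cost * b / total for name, b in usage.items()}


# Hypothetical subnets and a monthly access cost of 1000 currency units.
departments = {"engineering": "10.1.0.0/16", "sales": "10.2.0.0/16"}
billing_records = [
    {"src_ip": "10.1.4.20", "bytes": 2000},
    {"src_ip": "10.1.9.3", "bytes": 1000},
    {"src_ip": "10.2.0.8", "bytes": 1000},
]

shares = allocate_cost(billing_records, departments, total_cost=1000)
print(shares)
```

With 3000 of the 4000 bytes coming from the engineering subnet, engineering is charged three quarters of the cost and sales one quarter.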

Is NetFlow of Use to Pure LAN Users?

The answer is ‘Yes’. Even if most of your work is within a LAN, if you are hosting complex and large applications, performance of the network will always be an issue. VoIP, video conferencing, media streaming, SMTP, database access, content management and so on all have differing network usage characteristics and requirements. Recognizing these will help network managers provide better service. TCO calculations are only possible if the actual traffic is known, and malware and viruses can be as prevalent in LANs as they are elsewhere.

In Summary

IT and Network managers in most organizations are always under pressure to deliver mission critical applications and content across the entire organization. Quality of Service has become critical and the types of services being offered are only increasing. Under these circumstances, NetFlow has become a critical tool to ensure acceptable application performance and corporate governance.