Malware Presence Confirmation using Wireshark and Scapy (PhantomStealer Analysis)

 

🟦 Introduction

Malware analysis is the process of identifying and understanding the behavior of malicious software. Network traffic analysis plays a crucial role in it, because it can reveal hidden threats and suspicious activity on a network.

In this project, a PCAP file containing PhantomStealer malware traffic is analyzed using Wireshark and Scapy tools. The objective is to identify malicious communication patterns, suspicious domains, and data exfiltration activities.

This analysis helps in understanding how malware interacts with external servers and how such threats can be detected in real-world scenarios.

🟦 Objectives

  • To analyze malware traffic using Wireshark
  • To identify suspicious IP addresses and domains
  • To detect abnormal communication patterns in network traffic
  • To visualize traffic behavior using graphs

    🟦 PCAP Source Link:
    https://www.malware-traffic-analysis.net/2026/01/30/index.html

    🟦 Architecture of the System



    The infected system generates network traffic which is captured in a PCAP file. The traffic is analyzed using Wireshark and Scapy to identify malicious behavior, suspicious communication, and anomalies.

    This architecture ensures a structured approach to malware detection using both manual inspection and automated analysis.

    🟦 Procedure

    1. Downloaded the PhantomStealer PCAP file from the Malware Traffic Analysis website
    2. Opened the PCAP file in Wireshark
    3. Identified the victim IP using Statistics → Conversations → IPv4
    4. Applied filters:
      • dns
      • http
      • tcp
      • ip.addr == 10.1.30.101
    5. Analyzed suspicious domains and communication patterns
    6. Generated graphs using Wireshark (I/O Graph)
    7. Used Scapy (Python) for advanced analysis and graph generation
    8. Exported all graphs as images (not screenshots)
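Steps 4 and 7 can be approximated in Python. The sketch below is not the script used for this analysis; it assumes Scapy is installed, and the function names are hypothetical. `load_records` flattens each IPv4 packet into a `(time, src, dst, size)` tuple, and `involves_host` mirrors the Wireshark filter `ip.addr == 10.1.30.101`.

```python
def load_records(pcap_path):
    """Read a PCAP with Scapy and return (time, src, dst, size) tuples for IPv4 packets."""
    # Imported lazily so the filter helper below also works without Scapy installed.
    from scapy.all import rdpcap, IP

    records = []
    for pkt in rdpcap(pcap_path):
        if IP in pkt:
            records.append((float(pkt.time), pkt[IP].src, pkt[IP].dst, len(pkt)))
    return records


def involves_host(record, host):
    """Equivalent of the display filter `ip.addr == host`: match source or destination."""
    _, src, dst, _ = record
    return host in (src, dst)


# Hand-made sample (not taken from the PCAP) to show the filter in action.
sample = [
    (0.0, "10.1.30.101", "185.27.134.154", 60),
    (0.1, "185.27.134.154", "10.1.30.101", 1500),
    (0.2, "203.0.113.5", "203.0.113.9", 80),
]
victim_traffic = [r for r in sample if involves_host(r, "10.1.30.101")]
print(len(victim_traffic))  # 2
```

With a real capture, the records would come from a call such as `load_records("capture.pcap")`, where the file name is a placeholder for the downloaded PCAP.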

    🟦 Victim Identification

    The IP address 10.1.30.101 was identified as the infected system. This IP appears in all major communications and shows the highest number of packets exchanged with external servers.

    A large amount of traffic (6984 packets and 10 MB of data) was observed between this IP and 185.27.134.154, which indicates suspicious behavior.

    This confirms that 10.1.30.101 is the compromised host.
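The same reasoning can be reproduced outside Wireshark. The following is a minimal sketch, assuming each packet has already been reduced to a `(src, dst, size)` tuple; it groups traffic per IP pair in the spirit of the Conversations view and picks the private address that takes part in the most packets. The function names and the sample values are illustrative, not taken from the capture.

```python
import ipaddress
from collections import defaultdict


def conversation_stats(records):
    """Aggregate (packets, bytes) per unordered IP pair."""
    stats = defaultdict(lambda: [0, 0])
    for src, dst, size in records:
        pair = tuple(sorted((src, dst)))
        stats[pair][0] += 1
        stats[pair][1] += size
    return {pair: tuple(totals) for pair, totals in stats.items()}


def busiest_private_host(records):
    """Return the private (internal) address involved in the most packets."""
    counts = defaultdict(int)
    for src, dst, _ in records:
        for addr in (src, dst):
            if ipaddress.ip_address(addr).is_private:
                counts[addr] += 1
    return max(counts, key=counts.get) if counts else None


# Hand-made sample: three packets with the external host, one with a local gateway.
sample = [
    ("10.1.30.101", "185.27.134.154", 60),
    ("185.27.134.154", "10.1.30.101", 1500),
    ("185.27.134.154", "10.1.30.101", 1500),
    ("10.1.30.101", "10.1.30.1", 80),
]
print(busiest_private_host(sample))  # 10.1.30.101
print(conversation_stats(sample)[("10.1.30.101", "185.27.134.154")])  # (3, 3060)
```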

    🟦 Important Observation

    The IP address 185.27.134.154 shows the highest communication with the victim system.

    • Maximum packets observed: 6984
    • Maximum data transfer: 10 MB

    This strongly suggests that the IP is acting as a Command & Control (C2) server or data exfiltration server.

    🟦 Inferences (Proof of Malware Presence)

    🔹 Wireshark Analysis



    Inference 1: Overall Network Traffic Behavior
    • Sudden spike indicates abnormal activity
    • Burst-like pattern suggests automated malware behavior




    Inference 2: TCP Traffic and Errors
    • High TCP spikes indicate heavy communication
    • TCP errors suggest unstable malicious connections




    Inference 3: DNS Traffic
    • DNS spikes indicate multiple domain requests
    • Suggests malware locating external servers
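To see which names drive the DNS spikes, the query names can be tallied. This is a sketch under assumptions: in Scapy the query name is typically available as `pkt[DNSQR].qname`, as bytes with a trailing dot, and the helper names below are hypothetical. The domains in the sample are placeholders, not domains from the capture.

```python
from collections import Counter


def normalize_qname(qname):
    """Turn a DNS query name (bytes or str, possibly with a trailing dot) into lowercase text."""
    if isinstance(qname, bytes):
        qname = qname.decode("ascii", errors="replace")
    return qname.rstrip(".").lower()


def top_queried_domains(qnames, n=10):
    """Return the n most frequently queried names with their counts."""
    return Counter(normalize_qname(q) for q in qnames).most_common(n)


# Placeholder names in the mixed forms they may arrive in.
sample = [b"Example.com.", "example.com", b"cdn.example.net."]
print(top_queried_domains(sample))  # [('example.com', 2), ('cdn.example.net', 1)]
```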




    Inference 4: HTTP Traffic
    • HTTP bursts indicate data exchange
    • Suggests malware communication with server




    Inference 5: Data Transfer
    • Large data spikes indicate heavy transfer
    • Suggests possible data exfiltration


    🟦 Advanced Analysis using Scapy

    For deeper analysis, Scapy was used to generate additional graphs and insights.

    Inference 6: Protocol Distribution Analysis


    • The graph shows the distribution of network protocols used in the captured traffic.
    • It is observed that TCP traffic dominates the communication, accounting for approximately 99.9% of total packets.
    • UDP traffic is negligible, indicating that most communication is connection-oriented.
    • Malware commonly uses the TCP protocol for reliable communication with command-and-control servers, although TCP also dominates most benign traffic.
    • On its own this dominance is weak evidence, but it is consistent with structured and persistent communication initiated by the malware.
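A protocol share like the one in this graph can be computed from one label per packet. The sketch below assumes the labels were produced beforehand (with Scapy, for example, by checking `TCP in pkt` and `UDP in pkt`); the function name and the sample are illustrative and do not reproduce the figures from the capture.

```python
from collections import Counter


def protocol_shares(labels):
    """Return the percentage of packets per protocol label."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {proto: 100.0 * n / total for proto, n in counts.items()}


# Hand-made sample: nine TCP packets and one UDP packet.
sample = ["TCP"] * 9 + ["UDP"]
shares = protocol_shares(sample)
print(shares)  # {'TCP': 90.0, 'UDP': 10.0}
```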


    Inference 7: Source IP Activity Analysis

    • The graph represents the frequency of packets sent by different source IP addresses.
    • It is clearly observed that the IP address 185.27.134.154 has the highest number of packets compared to others.
    • This unusually high activity indicates that this IP is heavily involved in the communication process.
    • The victim system 10.1.30.101 also appears but with comparatively lower packet count.
    • The dominance of a single external IP suggests it may be acting as a command-and-control (C2) server. Because it is the main sender in this graph, it is delivering data to the victim rather than only receiving it.



    Inference 8: Destination IP Activity Analysis

    • The graph shows the distribution of packets received by different destination IP addresses.
    • It is observed that the IP address 10.1.30.101 receives the highest number of packets.
    • This confirms that the system is the primary target of incoming network communication.
    • External IP addresses such as 185.27.134.154 also appear as destinations, indicating bidirectional communication.
    • This pattern suggests that the infected system is actively communicating with external malicious servers, supporting the presence of malware activity.
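The source and destination counts behind Inferences 7 and 8 amount to two tallies over the same packets. A minimal sketch, assuming each packet has been reduced to a `(src, dst)` pair (with Scapy, `pkt[IP].src` and `pkt[IP].dst`); the function name and sample are illustrative only.

```python
from collections import Counter


def endpoint_counts(pairs):
    """Count how often each address appears as a source and as a destination."""
    sources = Counter(src for src, _ in pairs)
    destinations = Counter(dst for _, dst in pairs)
    return sources, destinations


# Hand-made sample: the external host sends twice, the victim replies once.
sample = [
    ("185.27.134.154", "10.1.30.101"),
    ("185.27.134.154", "10.1.30.101"),
    ("10.1.30.101", "185.27.134.154"),
]
sources, destinations = endpoint_counts(sample)
print(sources.most_common(1))       # [('185.27.134.154', 2)]
print(destinations.most_common(1))  # [('10.1.30.101', 2)]
```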


    Inference 9: Packet Size Distribution Analysis

    • The graph shows the distribution of packet sizes observed in the captured network traffic.
    • A large number of packets are concentrated in the higher size range (around 1300–1500 bytes).
    • This indicates that most packets are carrying substantial amounts of data rather than just control information.
    • Smaller packets are also present, which likely represent control or signaling messages.
    • The combination of large and small packets suggests structured communication, typical of malware performing data exfiltration along with control operations.
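A size distribution of this kind can be built by binning packet lengths. The sketch below assumes a list of packet sizes in bytes (with Scapy, `len(pkt)`); the bin width of 100 bytes and the function name are arbitrary choices for illustration.

```python
from collections import Counter


def size_histogram(sizes, bin_width=100):
    """Count packets per size bin; each key is the lower edge of a bin in bytes."""
    bins = Counter((size // bin_width) * bin_width for size in sizes)
    return dict(sorted(bins.items()))


# Hand-made sample: two small control-sized packets and three near-full-size ones.
sample = [60, 66, 1400, 1490, 1500]
print(size_histogram(sample))  # {0: 2, 1400: 2, 1500: 1}
```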



    Inference 10: TCP Flags Analysis

    • The graph represents the distribution of TCP flags observed in the network traffic.
    • A large number of packets contain the ACK (Acknowledgement) flag, indicating continuous and established communication sessions.
    • The presence of PA (Push + Acknowledgement) flags suggests active data transfer between the infected system and external servers.
    • Very few connection initiation flags (such as SYN) are observed, indicating that most connections are already established and maintained.
    • This pattern reflects persistent and stable communication, which is typical of malware maintaining a connection with a command-and-control (C2) server.
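The flag distribution can be tallied from one flag string per TCP packet. This sketch assumes the strings were obtained beforehand (with Scapy, `str(pkt[TCP].flags)` yields values such as `A`, `PA`, or `S`); the function name and sample are hypothetical. It also reports the share of SYN-only packets, which is what the observation about few connection initiations rests on.

```python
from collections import Counter


def flag_summary(flag_strings):
    """Return (counts per flag combination, fraction of SYN-only packets)."""
    counts = Counter(flag_strings)
    total = sum(counts.values())
    syn_share = counts.get("S", 0) / total if total else 0.0
    return counts, syn_share


# Hand-made sample of one short session: handshake, data, teardown.
sample = ["S", "SA", "A", "PA", "PA", "A", "A", "FA"]
counts, syn_share = flag_summary(sample)
print(counts.most_common(2))  # [('A', 3), ('PA', 2)]
print(syn_share)              # 0.125
```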



    Inference 11: Packet Frequency over Time Analysis

    • The graph illustrates the number of packets transmitted over different time intervals.
    • A significant spike is observed in the early stage, followed by another smaller spike later in the timeline.
    • These bursts of high packet activity indicate sudden and concentrated communication events.
    • Such irregular traffic patterns are not typical of normal user behavior and suggest automated processes.
    • This behavior is consistent with malware generating bursts of traffic for data transmission or communication with command-and-control servers.
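A packets-over-time graph comes down to counting packets per time bucket. A minimal sketch, assuming a list of packet timestamps in seconds (with Scapy, `float(pkt.time)`); the one-second interval and the function name are illustrative choices.

```python
from collections import Counter


def packets_per_interval(timestamps, interval=1.0):
    """Count packets per interval, indexed from the first packet in the capture."""
    if not timestamps:
        return {}
    start = min(timestamps)
    buckets = Counter(int((t - start) // interval) for t in timestamps)
    return dict(sorted(buckets.items()))


# Hand-made sample: a burst of four packets, silence, then two more.
sample = [100.0, 100.25, 100.5, 100.75, 103.5, 103.75]
print(packets_per_interval(sample))  # {0: 4, 3: 2}
```

Intervals with no packets are simply missing from the result, so a plot would need to fill them with zero to show the gaps between bursts.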



    Inference 12: Unique IP Growth Analysis

    • The graph represents the increase in the number of unique IP addresses observed over time in the network traffic.
    • Initially, there is a rapid increase in unique IPs, followed by periods where the count remains constant.
    • This indicates that the infected system quickly establishes communication with a few external hosts and then maintains stable connections.
    • The step-like growth pattern suggests that new connections are introduced in phases rather than continuously.
    • Such behavior is typical of malware, where it connects to specific command-and-control servers and maintains persistent communication.
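The step-like curve can be derived by tracking the set of addresses seen so far. The sketch below assumes `(time, src, dst)` tuples sorted by time; the function name is hypothetical, and the addresses other than the victim and 185.27.134.154 are placeholders.

```python
def unique_ip_growth(records):
    """Return (time, unique_count) at every point where a new address first appears."""
    seen = set()
    growth = []
    for t, src, dst in records:
        before = len(seen)
        seen.update((src, dst))
        if len(seen) > before:
            growth.append((t, len(seen)))
    return growth


# Hand-made sample: early discovery of hosts, then a long quiet stretch.
sample = [
    (0.0, "10.1.30.101", "10.1.30.1"),
    (0.5, "10.1.30.101", "185.27.134.154"),
    (1.0, "185.27.134.154", "10.1.30.101"),
    (9.0, "10.1.30.101", "203.0.113.7"),
]
print(unique_ip_growth(sample))  # [(0.0, 2), (0.5, 3), (9.0, 4)]
```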



    Inference 13: TCP vs UDP Traffic Analysis

    • The graph compares the number of TCP and UDP packets in the captured network traffic.
    • It is observed that almost all communication is carried out using the TCP protocol, with UDP traffic negligible by comparison (TCP accounts for approximately 99.9% of packets, as noted in Inference 6).
    • This indicates that the communication is connection-oriented, which provides reliable data transmission.
    • Malware often prefers the TCP protocol for maintaining stable and continuous communication with command-and-control servers.
    • The near-absence of UDP traffic is consistent with communication that is structured and controlled rather than random or broadcast-based.



    Inference 14: Packet Inter-Arrival Time Analysis

    • The graph shows the time delay between consecutive packets in the network traffic.
    • Irregular variations in inter-arrival times are observed, with sudden spikes and drops.
    • Such inconsistent timing is not typical of normal user-driven traffic, which tends to be smoother.
    • The presence of rapid bursts followed by delays indicates automated communication patterns.
    • This behavior is characteristic of malware, where packets are sent in bursts for data transmission or command exchange.
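Inter-arrival times are the differences between consecutive packet timestamps. A minimal sketch, assuming a list of timestamps in seconds; the function name and the sample values are illustrative.

```python
def inter_arrival_times(timestamps):
    """Return the gaps between consecutive packets, in capture order by time."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


# Hand-made sample: three packets close together, then a long pause.
sample = [0.0, 0.5, 0.75, 4.75]
gaps = inter_arrival_times(sample)
print(gaps)       # [0.5, 0.25, 4.0]
print(max(gaps))  # 4.0
```

Short gaps followed by an occasional long one are what a burst-and-pause pattern looks like in this representation.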



    Inference 15: Packet Size Variation over Time

    • The graph illustrates the variation of packet sizes with respect to time.
    • A wide distribution of packet sizes is observed throughout the timeline.
    • Larger packets appear in clusters, indicating bursts of data transmission.
    • Smaller packets are also present, representing control or signaling messages.
    • This combination suggests coordinated communication involving both control instructions and data transfer, which is typical behavior of malware.


    Inference 16: Incoming vs Outgoing Traffic Analysis

    • The graph compares the number of incoming and outgoing packets for the infected system.
    • It is clearly observed that incoming traffic is significantly higher than outgoing traffic.
    • This indicates that the infected system is receiving a large amount of data from external sources.
    • Such behavior suggests that the system is actively communicating with external servers, possibly receiving commands or additional payloads.
    • The imbalance in traffic is consistent with abnormal communication patterns, which are commonly associated with malware activity.
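The incoming and outgoing counts can be computed relative to the victim address. The sketch below assumes `(src, dst)` pairs per packet; the function name and the sample are hypothetical and do not reproduce the counts from the capture.

```python
def direction_counts(pairs, host):
    """Return (incoming, outgoing) packet counts from the point of view of host."""
    incoming = sum(1 for src, dst in pairs if dst == host and src != host)
    outgoing = sum(1 for src, dst in pairs if src == host and dst != host)
    return incoming, outgoing


# Hand-made sample: three packets towards the victim, one from it.
sample = [
    ("185.27.134.154", "10.1.30.101"),
    ("185.27.134.154", "10.1.30.101"),
    ("185.27.134.154", "10.1.30.101"),
    ("10.1.30.101", "185.27.134.154"),
]
print(direction_counts(sample, "10.1.30.101"))  # (3, 1)
```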



    Inference 17: Traffic Variability Analysis

    • The graph represents the variation in network traffic over time using a moving average of packet sizes.
    • It is observed that the traffic remains relatively high and consistent for most of the duration, indicating continuous data exchange.
    • Sudden drops and fluctuations are also visible, suggesting intermittent interruptions or changes in communication patterns.
    • Such variability indicates non-uniform traffic behavior, which is not typical of normal user activity.
    • This pattern is characteristic of malware, where data transmission occurs in bursts along with periods of reduced activity.
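A moving average of packet sizes smooths the per-packet values over a sliding window. This is a sketch assuming a list of sizes in capture order; the window length used for the actual graph is not stated, so the value of 3 here is only an example.

```python
def moving_average(values, window=3):
    """Return the simple moving average over a sliding window of the given length."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running = 0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value that left the window
        if i >= window - 1:
            averages.append(running / window)
    return averages


# Hand-made sample: full-size packets followed by small ones.
sample = [1500, 1500, 1500, 60, 60]
print(moving_average(sample))  # [1500.0, 1020.0, 540.0]
```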



    Inference 18: IP Frequency Distribution Analysis

     

    • The graph represents the proportion of network traffic associated with different IP addresses.
    • It is clearly observed that the IP address 185.27.134.154 occupies the largest portion of the traffic.
    • This dominance indicates that the majority of communication is directed towards or originates from this external IP.
    • Other IP addresses contribute only a small fraction, highlighting an imbalance in communication distribution.
    • Such a pattern strongly suggests that the external IP is acting as a primary command-and-control (C2) server or data collection endpoint in the malware activity.



    Inference 19: Flow Duration Analysis

     

    • The graph represents the duration of communication flows between different IP pairs.
    • It is observed that most flows have very short durations, while a few flows last significantly longer.
    • The presence of longer-duration flows indicates sustained communication between specific IP addresses.
    • Such persistent connections are not typical of normal user behavior and suggest continuous interaction with external servers.
    • This pattern is characteristic of malware maintaining long-lived sessions with command-and-control (C2) servers for data exchange and control.
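Flow duration per IP pair can be estimated as the time between the first and last packet exchanged by that pair. The sketch below assumes `(time, src, dst)` tuples and treats both directions as one flow; grouping by ports as well would be a finer definition, and the function name is hypothetical.

```python
def flow_durations(records):
    """Return {unordered IP pair: seconds between its first and last packet}."""
    first, last = {}, {}
    for t, src, dst in records:
        pair = tuple(sorted((src, dst)))
        first[pair] = min(t, first.get(pair, t))
        last[pair] = max(t, last.get(pair, t))
    return {pair: last[pair] - first[pair] for pair in first}


# Hand-made sample: one long-lived pair and one short exchange.
sample = [
    (0.0, "10.1.30.101", "185.27.134.154"),
    (30.0, "185.27.134.154", "10.1.30.101"),
    (1.0, "10.1.30.101", "10.1.30.1"),
    (1.5, "10.1.30.1", "10.1.30.101"),
]
durations = flow_durations(sample)
print(durations[("10.1.30.101", "185.27.134.154")])  # 30.0
print(durations[("10.1.30.1", "10.1.30.101")])       # 0.5
```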



    Inference 20: Top Communication Pairs Analysis

    • The graph shows the most active communication pairs between source and destination IP addresses.
    • It is clearly observed that the communication between 185.27.134.154 and 10.1.30.101 dominates the traffic.
    • This indicates that the infected system is primarily interacting with a specific external IP address.
    • Such concentrated communication is unusual in normal network behavior, where traffic is typically distributed across multiple endpoints.
    • This strongly suggests that the external IP is acting as a command-and-control (C2) server or data exfiltration endpoint in the malware activity.

     

    🟦 Effects of Malware

    • Data Theft: Sensitive information can be stolen
    • Unauthorized Access: Attackers gain system control
    • Performance Degradation: System becomes slow
    • Network Congestion: Excess traffic affects network
    • Privacy Violation: User data gets exposed

    🟦 New Findings

    • High communication with a specific external IP
    • Repeated DNS queries to suspicious domains
    • Abnormal traffic spikes observed
    • Continuous TCP sessions detected

    🟦 Use of AI

    AI tools such as ChatGPT, Claude.ai, and Draw.io were used to assist in analyzing network patterns, generating insights, and structuring the documentation. AI helped in improving clarity and identifying suspicious behaviors.


    🟦 Conclusion

    The analysis successfully identified malicious activity in the network traffic. The infected system was detected along with suspicious communication patterns and external connections.

    The use of Wireshark and Scapy provided deep insights into malware behavior. This highlights the importance of network monitoring in detecting cyber threats.


    🟦 YouTube Video

    https://youtu.be/P_eFz2S4Bcs


    🟦 GitHub Repository

    https://github.com/Sahil689172/Malware-Analysis-Phantom-Stealer


    🟦 References

    1. Malware Traffic Analysis Website
    2. Wireshark Documentation
    3. Cybersecurity Resources

    🟦 Acknowledgement

    I would like to express my sincere gratitude to the School of Computer Science and Engineering (SCOPE), Vellore Institute of Technology, Chennai for offering the Computer Networks theory and laboratory courses during the Winter Semester 2025–2026 with an industry-standard and industry-relevant curriculum.

    I would like to extend my heartfelt thanks to my course faculty, Dr. T. Subbulakshmi, Professor, SCOPE, VIT Chennai, for her continuous guidance, support, and valuable insights throughout the completion of this assignment.

    I would also like to acknowledge Gerald Combs, the founder of Wireshark and recipient of the ACM Software System Award (2018), for providing an excellent tool that enabled effective network traffic analysis.

    I would like to express my gratitude to Bradley Duncan for creating insightful blogs on malware analysis, which helped in understanding the behavior and impact of malware without executing it, and guided further detailed analysis.

    I extend my appreciation to all online resources, research materials, and references that contributed to enhancing my knowledge and supporting the successful completion of this blog.

    I would like to thank my peers and friends for their valuable suggestions, discussions, and support during the preparation of this assignment.

    I am deeply grateful to my parents, family members, and well-wishers for their constant encouragement, motivation, and support.

    Author

    Sahil Poply, II year B.Tech. CSE student, School of Computer Science and Engineering, VIT Chennai

