Jump to content

Firewall Log Fun


Pic0o

Recommended Posts

This thread is ongoing, but let me start with the results I have from a year worth of dropped firewall connections.

  • 228376

    January 2016

  • 253698

    February 2016

  • 244374

    March 2016

  • 494842

    April 2016

  • 611021

    May 2016

  • 259013

    June 2016

  • 529243

    July 2016

  • 406937

    August 2016

  • 2096766

    September

  • 264421

    October

Let's jump back a minute. I am importing firewall logs for dropped connections into a MS SQL Database. September as you can see is a fun month with 2,096,766 records.

Since my firewall is a Zyxel device, I gave a look at the .csv delimited log output. Easily enough you can use a Data Import Wizard to spin the logs into some tables. Rough table to log structure is as such:

CREATE TABLE zy_2016-09 (
  time VARCHAR(50) NULL,
  source VARCHAR(50) NULL,
  destination VARCHAR(50) NULL,
  priority VARCHAR(50) NULL,
  category VARCHAR(50) NULL,
  note VARCHAR(50) NULL,
  sour_interface VARCHAR(50) NULL,
  dest_interface VARCHAR(50) NULL,
  protocol VARCHAR(50) NULL,
  message VARCHAR(250) NULL,
  col00 VARCHAR(250) NULL,

I am having fun crawling some output. Typically it's some sort of fancy OpSec to not say your type of network gear, but this is meant to be informative and hopefully helpful.

So let's crawl some queries and output in the next post.

Link to comment

I am in the middle of crunching data at the moment. Since we have some records, I am starting with the gigantic table for obvious trends in the dropped connections.

 

I didn't have good criteria yet, so I went with ordering the output by Source, in Descending order. Giving the old scroll observation, I saw quite a few connections on port :7759.

2016-09-19 16:21:36 99.98.xx.xx:7759 xxx.xxx.xxx.xxx:7759 notice firewall ACCESS BLOCK wan1 udp Match default rule DROP

I figured I'll filter the IP result for the moment, with the xxx.xxx address being to my web IP at the time. Yay it was dropped. So that's a nice log to see.

 

Jumping back, I started with normal phishing ports but they were not yet as interesting as the following query.

SELECT * FROM [dbo].[zy2016-09]
WHERE [source] like '%:7759'
ORDER BY [source] desc

975340 rows returned. Out of 2096766 rows. That was a big old flag for just looking into the 99.xx range by a output sort.

 

I'm going back to reviewing results and trending them. I did skip some details for procuring the logs from your device, setting the firewall up to log to a USB device, and so on. Manually reviewing logs sucks and being able to do queries with COUNT(*) for total, is nice to know what you might be dealing with. Always remember the internet is being port scanned, so trying to log some of it hitting you, let's you see the weather online.

 

Jolly Weekend on the upcoming Halloween season. :yar:

Link to comment

Let's get some Python involved. Manually writing queries is going to be a pain and take a ton of time.

#Dump Queries with start to end port range

import sys

pstartNum = int(input("Source Port number start: "))
q1 = "SELECT COUNT(*) AS Port_"
q2 = "FROM [dbo].[zy2016-09]"
qWat = "WHERE [Column 1] like '%:"
qClose = "'"
for x in range(100):
    sys.stdout.write("%s%s %s \n" % (q1, pstartNum, q2));
    sys.stdout.write("%s%s%s \n" % (qWat, pstartNum, qClose));
    print("GO")
    print()
    pstartNum = pstartNum - 1
exit  

While I'm not a python wizard, this code works. What it does is start from a user-inputted port number and runs a loop 100 times, subtracting 1 for each iteration.

Taking a description of why I am using sys.stdout.write, is to control the spacing on string output, so I have functional SQL queries. If you try to do this with the print() operator, you will get spaces in your output that will break your SQL Query.

I was trying to find a way to handle this in print, but I was fighting against how string interpolation and output control in Python works. Ending this part with what a pair from the 100 loop looks like

 

Output:

SELECT COUNT(*) AS Port_8888 FROM [dbo].[zy2016-09] 
WHERE [Column 1] like '%:8888' 
GO

SELECT COUNT(*) AS Port_8887 FROM [dbo].[zy2016-09] 
WHERE [Column 1] like '%:8887' 
GO

I defined the q variables as parts of the SQL query, instead of kludging together a nasty looking sys.stdout.write line, while keeping it more easy to edit and maintain.

 

By adding 'GO' to the 3rd line, I can have the other queries run instead of not getting results until all 100 queries complete. For some samples of SQL code, check out this MS SQL usage thread I have up here.

Link to comment

Debugging the SQL, here is the breakdown of the output above.

SELECT COUNT(*) AS Port_8887 FROM [dbo].[zy2016-09]

WHERE [Column 1] like '%:8887'

GO

SELECT COUNT(*) AS Port_8887

This does a count for the matched records and displays it to your results. If you want to see the record results, replace COUNT(*) with *

 

 

FROM [dbo].[zy2016-09]

Here is the name of the table you imported from your firewall log. In my case, it's the name I gave to the exported firewall logs.

 

WHERE [Column 1] like '%:8887'

This is the search criteria for the entire table. In this case, :8887 represents the port number for [Columm 1]. Or as I defined in the 1st post source ip address.

 

GO

Ending with this line, says run these individually, instead of waiting for all 100 of your queries to finish.

 

 

Running this in Microsoft SQL Server Management Studio, from a New Query window, you will get similar output to below image.

PortCountOutput_Query.png

I am showing 2 example outputs, but the python script is doing 100 iterations.

Link to comment

Jumping back to source Log files.

 

Depending on your device, check the top of the file. You will likely have some column headers, and in my case I had a line of '======' characters. I find it easier to remove the header and 2nd line, then save the modified file as a new file.

 

Example log file header for Zyxel firewall:

Time ,Source ,Destination ,Priority ,Category ,Note ,Source Interface ,Destination Interface ,Protocol ,Message

========================================================================================================================================================================================================================================================================================

To save hassle on the import, I saved the modified files as zy(Year)-(Month) (IE: zy2016-09) as seen in the SQL code examples. These became my table names when I imported them into my database. By deleting the top two lines, the new file works as a clean csv to import.

Link to comment

What is the python doing?

 

I read a few books for python and read the online manual, but trying to format string output was driving me crazy. To save you some rage, my observations are that using print("text",variable,"rest of query line 1") will always force a space to be inserted. This will not fly in SQL queries and you don't want that hassle.

Trying to avoid vomit inducing code and searching around, I found that using stdout.write, lets you define the output spacing, followed by your variable values. It took me vastly longer than I would have liked to have figured this out, but considering generating database scripts was the main goal I had for learning python, I'm pretty excited this works. :)

 

The running python code is higher in the thread, but below I explain each line, with a # comment under each line.

 

Line for line breakdown to build the SQL Queries:

import sys
# import sys so we can use sys.stdout.write

pstartNum = int(input("Source Port number start: "))
# Asking to have the user (You) enter a start number.

q1 = "SELECT COUNT(*) AS Port_"
# Start of SQL Query, in this case we are getting a count, instead of a display of the values matching the query.

q2 = "FROM [dbo].[zy2016-09]"
# From portion of a SQL Query

qWat = "WHERE [Column 1] like '%:"
# Query for your search criteria.

qClose = "'"
# Close the SQL Query for the like string.

for x in range(100):
# Run this loop 100 times.

    sys.stdout.write("%s%s %s \n" % (q1, pstartNum, q2));
# The left side with %s placement, says grab the 1st part of the Select variable, Current iteration of loop, with a space added for the variable for 'FROM (tablename)'.

    sys.stdout.write("%s%s%s \n" % (qWat, pstartNum, qClose));
# Here the %s triple string placement is to have no spaces.  We call the Query, current iteration of the start number from the loop, and close the line with a "'".

    print("GO")
# Execute each query instead of waiting for all of them to complete.
    
    print()
# prints a blank line.  I saw some crazy code to do this by other means, but since all I want is a blank line, this is way easier and fits the bill.

    pstartNum = pstartNum - 1
# Once the 1st loop finishes, subtract 1 from the starting port number.  Do this for each iteration, by sayin that variable is equal to -1 from it's current state. (Please note you will error if you start with a port number less than 100.)

exit
# stops the 100 loop, so I can copy and paste this into my SQL query window and get results.

Remember this is a breakdown of the source code from the above 3rd post.

If you want this to run for 1000 iterations, change the for x in range(100) to for x in range(1000) or whatever works for you.

If you are not already doing this on your own equipment, you may anger some server admin with resource utilization. Or in the case of using Amazon Web Services or other hosting, your bill might get nasty expensive.

Link to comment
  • 2 months later...

Table structure and Imports revisited

In the case of dealing with Zyxel logs, we are better off leaving the 1st line for headers, but removing the line of '=' as shown in post #5. Once this line is removed, save the .log file and we can import the records into a table using MS SQL Import Wizard. This way, we know everything imported ok, and if it fails, most often it's because your input column has more characters than your Column defined in the new database table.

In the case of a Zyxel firewall log, this table structure should match the log format and import without error. Here are the columns to a supported character import length:

[Time]   varchar(50)
[Source]   varchar(50)
[Destination]   varchar(50)
[Priority]   varchar(50)
[Category]   varchar(50)
[Note]   varchar(250)
[Source Interface]   varchar(50)
[Destination Interface]   varchar(50)
[Protocol]   varchar(50)
[Message]   varchar(450)

 


There would be some spaces padded into the inserted column names, but at least it would be consistent to what your source data is. You can edit the Design of the table after the import, since changing the column names will not break the data sets. In queries the spaces on the end seem to be ignored, so edit at your preference.
When doing the import, you can click advanced on the Flat File Source portion of the import and in Advanced, you can edit the OutputColumnWidth to match the listed VarChar parameters.

Link to comment
  • 1 year later...

Slightly more robust Python, allowing you to add string for the table name in question when generating the queries.
Please keep in mind the way this is scripted, you will get negative values for the port if your starting number is less than 100.
 

#Dump Queries with start to end port range
	import sys
pstartNum = int(input("Source Port number start: "))
q1 = "SELECT COUNT(*) AS Port_"
qTbl = str(input("Table name for query build: "))
q2 = "FROM [dbo].[%s]" % (qTbl)
qWat = "WHERE [Column 1] like '%:"
qClose = "'"
for x in range(100):
    sys.stdout.write("%s%s %s \n" % (q1, pstartNum, q2));
    sys.stdout.write("%s%s%s \n" % (qWat, pstartNum, qClose));
    print("GO")
    print()
    pstartNum = pstartNum - 1
exit


I also attached an image for the sake of pretty color markup.

pySQLQueryBuilder.png

Link to comment

Testing the above code works great until I notice I broke my own formatting.
Since I'm importing the header names from the log flat file, [Column 1] is now [Source].  Looking a few posts up at the revised table structure, we can see the reason Column 1 lookups are failing is because the database fields are given actual descriptive labels thanks to the flat file we imported from.

Simply enough, I chose to edit the inline string for my qWat variable string.  I thought of making it an option at runtime but decided it would be more useful to branch out different python scripts for lookups on [Destination].  Same for making a script that shows all the results instead of the SELECT COUNT(*) call.

Same code below as above, except for where I changed the WHERE target on the qWat line.

#Dump Queries with start to end port range

import sys

pstartNum = int(input("Source Port number start: "))
q1 = "SELECT COUNT(*) AS Port_"
qTbl = str(input("Table name for query build: "))
q2 = "FROM [dbo].[%s]" % (qTbl)
qWat = "WHERE [Source] like '%:"
qClose = "'"
for x in range(100):
    sys.stdout.write("%s%s %s \n" % (q1, pstartNum, q2));
    sys.stdout.write("%s%s%s \n" % (qWat, pstartNum, qClose));
    print("GO")
    print()
    pstartNum = pstartNum - 1
exit

pySQLQueryBuilder001.png

Feel free to increase the range(100) to range(1000).  You can then paste the output python into a SQL Query window but be warned.  You might anger a DBA if this is running on something other than your workstation.

When you paste the output python SQL queries, just remove the >>> at the very end and the top 6 python version lines and your 2 string input questions.  You should start with a SELECT and end with a GO on the pasted python code, so it runs without error in a SQL Query window.

Link to comment

Show me the output record sets.
OK.  Now I would like to see what the actual records are for any port connections.  Modifying the previous python script, we are now going to make some queries that show any matched record sets in a query window.  I was kind of miffed why I kept getting my loop iteration number dumping in the select portion of the output, until I looked better at the 1st sys.stdout.write line.
Miraculously, it was still doing the part I told it to of printing pstartNum on the 1st line.  Once I cleaned that up, I had proper SELECT * formatting for the script build.
 

#Dump Queries with start to end port range

import sys

pstartNum = int(input("Source Port number start: "))
q1 = "SELECT *"
qTbl = str(input("Table name for query build: "))
q2 = "FROM [dbo].[%s]" % (qTbl)
qWat = "WHERE [Source] like '%:"
qClose = "'"
for x in range(100):
    sys.stdout.write("%s %s \n" % (q1, q2));
    sys.stdout.write("%s%s%s \n" % (qWat, pstartNum, qClose));
    print("GO")
    print()
    pstartNum = pstartNum - 1
exit



Thanks to this code, you can get subqueries to show results for any ports matched in your range.  Anything found will show a recordset, where as non-matched ports will just show headers with empty datasets.  If you decide to crank this up to 1000 or more iterations, be ready to wait and eat up some more CPU and Memory as it runs.
Try 500 iterations as your SQL Express query window may crash @ 1000 runs as mine just did. :bunny:

pySQLResultBuilder001.png

Most commonly I see, Source (Internet connections) trying to connect to Destination (my location) on port 23 or port 3389.  Telnet and Remote Desktop Protocol on 3389 are pretty heavily port sniffed for sure.  You can extrapolate if some of those connections might be a botnet if the Source port they are sending from seems especially common.  The world is your oyster.

For conversations sake, I had over 9000 attempts to connect to SSH on port 23 for the month of December 2017.  About 640 attempts to connect over RDP on port 3389.

Link to comment
×
×
  • Create New...