Building a tech profile of a company

Netlas.io
May 29, 2024

Intro

This article is a logical continuation of the previous one, which was dedicated to OSINT research of companies. That article already touched on some of the topics covered here, such as collecting tags. This time we go further and focus specifically on methods for building a profile of a company's technology stack.

Writing the two articles also produced a script that collects contact information, geographic data, the technologies used by the company, and much more. A link to it is given at the end of the article, and some code fragments are discussed in separate sections.

The investigation begins

ASD Tool

The first thing to do is build the perimeter of the organization under study. The easiest way is to use the Attack Surface Discovery Tool in the Netlas.io application. You can read more about this tool here.

I will use SpaceX as the target company for this article. The graph I constructed is available at the link.

Next, the graph was downloaded in text file format. Its approximate content looks like this:

Now, having a scope, we can begin to build a technology stack of the company.
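
As a rough illustration of how such a scope file can be consumed, here is a minimal sketch that sorts entries into IP addresses, CIDR blocks, and domain names. It assumes one entry per line; the function name classify_scope and the sample entries are hypothetical and are not part of the Netlas script.

```python
import ipaddress


def classify_scope(lines):
    """Split scope entries into IPs, CIDR blocks, and domains (one entry per line assumed)."""
    scope = {"ips": [], "cidrs": [], "domains": []}
    for raw in lines:
        entry = raw.strip()
        if not entry:
            continue  # ignore blank lines
        try:
            if "/" in entry:
                ipaddress.ip_network(entry, strict=False)
                scope["cidrs"].append(entry)
            else:
                ipaddress.ip_address(entry)
                scope["ips"].append(entry)
        except ValueError:
            # Anything that is not a valid address or network is treated as a domain name.
            scope["domains"].append(entry)
    return scope


# Hypothetical sample entries (documentation address ranges and example domains)
sample = ["192.0.2.10", "198.51.100.0/24", "shop.example.com", "", "example.com\n"]
print(classify_scope(sample))
```

Validating with the ipaddress module instead of a regular expression means that a name such as 1.2.3.4.example.com is not mistaken for an IP address.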

Getting provider

The first thing we will do as part of the research is to identify the provider that hosts the IP address or domain. To do this, we will use the IP/Domain info tool in Netlas.

Enter the address or name we are interested in into the search bar. For example, take one of the IPs we discovered.

Here we turn to the “Organization” item. This is where data about the provider is stored.

This can also be done through a script as part of intelligence automation. Here’s an example in Python:

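def getProvider(host, netlas_connection):
    # Request the host summary (IP/Domain info) and read the hosting organization.
    result = netlas_connection.host(host, fields=None, exclude_fields=False)
    provider = result['organization']

    # providersList is a module-level list; keep each provider only once.
    if provider not in providersList:
        providersList.append(provider)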

Here a request is sent to Netlas using the host method, after which the organization field is extracted from the result and added to the list of providers for the final output.
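
For hosts whose summary has no organization data, a more defensive variant may be useful. The following is a minimal sketch, not the script's actual code: FakeConnection is a hypothetical stand-in that mimics only the host() call shown above, the data is made up, and the assumption is that a host without an organization should simply be skipped.

```python
class FakeConnection:
    """Hypothetical stand-in for a Netlas connection; only host() is mimicked."""

    def __init__(self, records):
        self.records = records

    def host(self, host, fields=None, exclude_fields=False):
        return self.records.get(host, {})


def collect_providers(hosts, connection):
    """Return unique provider names in the order they were first seen."""
    providers = []
    for host in hosts:
        result = connection.host(host, fields=None, exclude_fields=False)
        provider = result.get("organization")  # None when the field is absent
        if provider and provider not in providers:
            providers.append(provider)
    return providers


# Made-up data for illustration only
conn = FakeConnection({
    "192.0.2.1": {"organization": "Example Hosting"},
    "192.0.2.2": {"organization": "Example Hosting"},
    "192.0.2.3": {},
})
print(collect_providers(["192.0.2.1", "192.0.2.2", "192.0.2.3"], conn))
```

Returning the list instead of appending to a global makes the function easier to test in isolation.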

Getting WHOIS info

The next stage of building a tech profile is collecting information about the organizations that registered the target domain names and IP addresses.

To do this, use the Netlas IP Whois Search and Domain Whois Search tools.

Let’s start with domain research, using the root domain of this study, spacex.com. Enter it into the search bar of the Domain Whois Search tool. We get the following result:

We are interested in the Registrar tab, Name field. It contains the company that registered the requested domain name.

This is what it looks like in the script:

def whoisRegistrarDomain(host, netlas_connection):
    sQuery = "domain:" + host

    # Count matching WHOIS records first so we know how many to download.
    cnt_of_res = netlas_connection.count(query=sQuery, datatype='whois-domain')
    time.sleep(1)  # short pause between API requests

    if cnt_of_res['count'] > 0:
        downloaded_query = netlas_connection.download(query=sQuery, datatype='whois-domain', size=cnt_of_res['count'])
    else:
        return

    for query_res in downloaded_query:
        data = json.loads(query_res)['data']
        registrar = data['registrar']['name']

        # registrarsList is a module-level list; keep each registrar only once.
        if registrar not in registrarsList:
            registrarsList.append(registrar)

In this code fragment, a query is built and the number of results for it is obtained. If there is at least one result, all of them are downloaded, and the registrar name is extracted from each one and placed in the list for the summary output.
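
The parsing step can be tried without an API key. Below is a minimal sketch that assumes each downloaded item is a JSON document with the data → registrar → name layout used in the fragment above; the function name extract_registrars and the sample records are made up for illustration.

```python
import json


def extract_registrars(raw_records):
    """Collect unique registrar names from raw JSON records."""
    registrars = []
    for raw in raw_records:
        data = json.loads(raw).get("data", {})
        name = (data.get("registrar") or {}).get("name")
        if name and name not in registrars:
            registrars.append(name)
    return registrars


# Made-up records shaped like the ones the fragment above expects
records = [
    b'{"data": {"domain": "example.com", "registrar": {"name": "Example Registrar, Inc."}}}',
    b'{"data": {"domain": "example.org", "registrar": {"name": "Example Registrar, Inc."}}}',
    b'{"data": {"domain": "example.net"}}',
]
print(extract_registrars(records))
```

Unlike the fragment above, this variant skips a record that has no registrar field instead of raising a KeyError.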

Next is parsing the IP WHOIS information. To obtain it, you need to use the appropriate tool and enter a query with the address of interest into the search. As an example, I will take another IP from the scope.

Pay attention to the Description tab, Registry field. This is where the registry we are looking for is shown. In this case, it is ARIN.

Here’s how this search can be implemented in code:

def whoisRegistryIP(host, netlas_connection):
    sQuery = "ip:" + host
    # A single WHOIS record is enough to identify the registry.
    downloaded_query = netlas_connection.download(query=sQuery, datatype='whois-ip', size=1)

    for query_res in downloaded_query:
        data = json.loads(query_res)['data']
        registry = data['asn']['registry'].upper()

        # registriesList is a module-level list; keep each registry only once.
        if registry not in registriesList:
            registriesList.append(registry)

This part of the script is similar to the previous one, except that the number of results is not checked. The script relies on Netlas covering the entire IP address space, so it assumes that the download always returns at least one record.
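
For cases where no record comes back after all (the empty input below is a made-up scenario), a defensive variant could look like the following sketch. It assumes the data → asn → registry layout used in the fragment above; extract_registry is a hypothetical helper, not part of the script.

```python
import json


def extract_registry(raw_records):
    """Return the upper-cased registry name from the first usable record, or None."""
    for raw in raw_records:
        data = json.loads(raw).get("data", {})
        registry = (data.get("asn") or {}).get("registry")
        if registry:
            return registry.upper()
    return None  # nothing was downloaded, or no record had a registry


print(extract_registry(['{"data": {"asn": {"registry": "arin"}}}']))
print(extract_registry([]))
```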

Getting Responses information

Next, we examine the responses that can be found for the constructed scope. The Responses Search Tool is used for this.

Here we will look for things like the technologies used by the target company, its geographical location, and associated emails.

Let’s open the tool and enter a query like:

host:hostName

where hostName is the IP address or domain name we are interested in.

I’m using shop.spacex.com as an example.

In the Description tab, you can already see geographic information. Next, under the response there is a list of tags detected by Netlas — these are applications or technologies used on the host:

To study contacts, you just need to open the corresponding tab.

So how can this be implemented in code?

def getResponseInfo(host, netlas_connection):
    sQuery = "host:" + host
    cnt_of_res = netlas_connection.count(query=sQuery, datatype='response')
    time.sleep(1)  # short pause between API requests

    if cnt_of_res['count'] > 0:
        downloaded_query = netlas_connection.download(query=sQuery, datatype='response', size=cnt_of_res['count'])
        for query_res in downloaded_query:
            data = json.loads(query_res)['data']

            # Tags: group technology names by their first category.
            try:
                tags = data['tag']
                for tag in tags:
                    tagName = tag['fullname']
                    tagCategory = tag['category'][0]

                    if tagCategory not in tagsDict:
                        tagsDict[tagCategory] = []
                    if tagName not in tagsDict[tagCategory]:
                        tagsDict[tagCategory].append(tagName)
            except (KeyError, IndexError, TypeError):
                # The response has no usable tag data; skip it.
                pass

            # Geo: group cities by country.
            try:
                city = data['geo']['city']
                country = data['geo']['country']

                if country not in geoDict:
                    geoDict[country] = []
                if city not in geoDict[country]:
                    geoDict[country].append(city)
            except (KeyError, TypeError):
                # The response has no usable geo data; skip it.
                pass

This function is the most complex in the script because of the number of fields it processes. As in the previous fragments, the number of results is obtained first; if it is non-zero, the data is downloaded and parsed. Since responses do not always contain the same set of fields, each group of field lookups is wrapped in try-except. If, for example, there is no “tag” field, the script simply moves on. I recommend this approach to everyone who plans to automate work with Netlas.

Tags and geographic information are stored in dictionaries so that the summary at the end of the script is easier to print.
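
The same grouping can also be expressed with defaultdict and dict.get, which avoids both the membership checks and the try-except blocks. This is a minimal sketch under the assumption that each response has the tag and geo layout used in the function above; the aggregate function and the sample data are hypothetical.

```python
from collections import defaultdict


def aggregate(responses):
    """Group tag names by first category and cities by country."""
    tags = defaultdict(set)
    geo = defaultdict(set)
    for data in responses:
        for tag in data.get("tag", []):
            categories = tag.get("category") or []
            name = tag.get("fullname")
            if categories and name:
                tags[categories[0]].add(name)
        location = data.get("geo") or {}
        if location.get("country") and location.get("city"):
            geo[location["country"]].add(location["city"])
    return tags, geo


# Made-up response data for illustration
responses = [
    {"tag": [{"fullname": "nginx", "category": ["Web servers"]}], "geo": {"country": "US", "city": "Seattle"}},
    {"tag": [{"fullname": "nginx", "category": ["Web servers"]}]},
    {"geo": {"country": "US", "city": "Ashburn"}},
]
tags, geo = aggregate(responses)
print({k: sorted(v) for k, v in tags.items()})
print({k: sorted(v) for k, v in geo.items()})
```

One behavioral difference from the function above: a single malformed tag is skipped on its own rather than ending the processing of the remaining tags in that response.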

Getting services

Finally, the last type of information collected is the services used by the target company. I’ll take Zendesk as an example.

To do this search, you need to use the DNS Tool by composing a query like:

domain:spacex.zendesk.*

This will return the following results:

The main purpose of this search is to make sure that the company uses a particular service. Of course, you can try to find them all using a query like this:

domain:spacex.*.*

However, in this case, there is a very high chance of false positives because not all results will relate to SpaceX.

The script implementation is quite simple:

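def getServices(netlas_connection):
    # servicesFileName is a module-level variable; 'None' means no services file was given.
    if servicesFileName != 'None':
        servicesFile = open(servicesFileName, "r")
    else:
        return

    print("Enter brand name (e.g. apple, microsoft): ")
    name = input()

    while True:
        line = servicesFile.readline()

        if not line:
            break

        line = line.replace("\n", "")

        # Builds a wildcard query such as domain:spacex.*zendesk.*
        sQuery = "domain:" + name + ".*" + line + ".*"

        cnt_of_res = netlas_connection.count(query=sQuery, datatype='domain')
        time.sleep(1)  # short pause between API requests

        # Any hit means the brand name appears together with this service name.
        if cnt_of_res['count'] > 0:
            servicesList.append(line)

    servicesFile.close()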

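The script takes as input a file listing the services the user is interested in and checks each of them in combination with the target brand name. Note that the query it builds has the form domain:brand.*service.*, which is slightly broader than the manual query shown earlier.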
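
To see which queries are generated for a given services file, the query-building step can be separated from the API calls. This is a minimal sketch; build_service_queries is a hypothetical helper, and "atlassian" is just an illustrative service name that does not come from the article.

```python
def build_service_queries(brand, service_lines):
    """Build one DNS query per non-empty service name."""
    queries = {}
    for raw in service_lines:
        service = raw.strip()
        if not service:
            continue  # a blank line would otherwise produce an overly broad query
        queries[service] = "domain:" + brand + ".*" + service + ".*"
    return queries


print(build_service_queries("spacex", ["zendesk\n", "\n", "atlassian\n"]))
```

Skipping blank lines matters here: an empty service name would turn the query into domain:spacex.*.*, which is exactly the broad search described above as prone to false positives.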

Run the script

Now that the program is completely ready, it’s time to test it. I run the script using the perimeter downloaded from ASD in the first step as input. The results are shown in the following images.

Full script listing

import netlas
import json
import time
import re
import ipaddress
import argparse
import sys

# Result accumulators shared by all functions
servicesList = []
providersList = []
registriesList = []
registrarsList = []

geoDict = {}
tagsDict = {}

api_key = 'yourKey'
servicesFileName = ''

def createParser():
    parser = argparse.ArgumentParser()
    parser.add_argument('-i', '--input', help='Path to input file')
    parser.add_argument('-k', '--key', default='None', help='Your Netlas.io API key')
    parser.add_argument('-s', '--services', default='None', help='Path to file with service names')

    return parser

def getProvider(host, netlas_connection):
    result = netlas_connection.host(host, fields=None, exclude_fields=False)
    provider = result['organization']

    if provider not in providersList:
        providersList.append(provider)

def getResponseInfo(host, netlas_connection):
    sQuery = "host:" + host
    cnt_of_res = netlas_connection.count(query=sQuery, datatype='response')
    time.sleep(1)

    if cnt_of_res['count'] > 0:
        downloaded_query = netlas_connection.download(query=sQuery, datatype='response', size=cnt_of_res['count'])
        for query_res in downloaded_query:
            data = json.loads(query_res)['data']

            # Tags: group technology names by their first category.
            try:
                tags = data['tag']
                for tag in tags:
                    tagName = tag['fullname']
                    tagCategory = tag['category'][0]

                    if tagCategory not in tagsDict:
                        tagsDict[tagCategory] = []
                    if tagName not in tagsDict[tagCategory]:
                        tagsDict[tagCategory].append(tagName)
            except (KeyError, IndexError, TypeError):
                pass

            # Geo: group cities by country.
            try:
                city = data['geo']['city']
                country = data['geo']['country']

                if country not in geoDict:
                    geoDict[country] = []
                if city not in geoDict[country]:
                    geoDict[country].append(city)
            except (KeyError, TypeError):
                pass

def whoisRegistryIP(host, netlas_connection):
    sQuery = "ip:" + host

    downloaded_query = netlas_connection.download(query=sQuery, datatype='whois-ip', size=1)

    for query_res in downloaded_query:
        data = json.loads(query_res)['data']
        registry = data['asn']['registry'].upper()

        if registry not in registriesList:
            registriesList.append(registry)

def whoisRegistrarDomain(host, netlas_connection):
    sQuery = "domain:" + host

    cnt_of_res = netlas_connection.count(query=sQuery, datatype='whois-domain')
    time.sleep(1)

    if cnt_of_res['count'] > 0:
        downloaded_query = netlas_connection.download(query=sQuery, datatype='whois-domain', size=cnt_of_res['count'])
    else:
        return

    for query_res in downloaded_query:
        data = json.loads(query_res)['data']
        registrar = data['registrar']['name']

        if registrar not in registrarsList:
            registrarsList.append(registrar)

def getServices(netlas_connection):
    if servicesFileName != 'None':
        servicesFile = open(servicesFileName, "r")
    else:
        return

    print("Enter brand name (e.g. apple, microsoft): ")
    name = input()

    while True:
        line = servicesFile.readline()

        if not line:
            break

        line = line.replace("\n", "")

        sQuery = "domain:" + name + ".*" + line + ".*"

        cnt_of_res = netlas_connection.count(query=sQuery, datatype='domain')
        time.sleep(1)

        if cnt_of_res['count'] > 0:
            servicesList.append(line)

    servicesFile.close()

def cidrPreparing(string, netlas_connection):
    lastIP = ""

    ips = ipaddress.ip_network(string)

    # Query the whole block as an inclusive range: ip:[first TO last]
    sQuery = "ip:[" + ips[0].compressed + " TO " + ips.broadcast_address.compressed + "]"

    cnt_of_res = netlas_connection.count(query=sQuery, datatype='whois-ip')
    time.sleep(1)

    if cnt_of_res['count'] > 0:
        downloaded_query = netlas_connection.download(query=sQuery, datatype='whois-ip', size=cnt_of_res['count'])
    else:
        return

    for query_res in downloaded_query:
        data = json.loads(query_res)['data']

        ip = data['net']['start_ip']

        # Skip consecutive records that start at the same IP.
        if ip == lastIP:
            continue
        else:
            functionHub(ip, netlas_connection)
            lastIP = ip

def functionHub(string, netlas_connection):
    # Entries that start like an IPv4 address are treated as IPs, everything else as domains.
    reg = re.match(r'([0-9]{1,3}[\.]){3}[0-9]{1,3}', string, 0)

    if reg:
        whoisRegistryIP(string, netlas_connection)
        time.sleep(1)
        getProvider(string, netlas_connection)
    else:
        whoisRegistrarDomain(string, netlas_connection)

    time.sleep(1)
    getResponseInfo(string, netlas_connection)

def printResults():
    print("== Summary ==")

    if tagsDict:
        print("-- Using applications --")
        for category in tagsDict.keys():
            print(category + ": ")
            print(*tagsDict[category], sep="\n")
            print("")
        print("\n")

    if geoDict:
        print("-- Geo information --")
        for country in geoDict.keys():
            print(country)
            print(*geoDict[country], sep="\n")
            print("")
        print("\n")

    if servicesList:
        print("-- Using services --")
        print(*servicesList, sep="\n")
        print("\n")

    if providersList:
        print("-- Providers --")
        print(*providersList, sep="\n")
        print("\n")

    if registrarsList:
        print("-- Registrars --")
        print(*registrarsList, sep="\n")
        print("\n")

    if registriesList:
        print("-- Registries --")
        print(*registriesList, sep="\n")
        print("\n")

if __name__ == '__main__':
    parser = createParser()
    namespace = parser.parse_args(sys.argv[1:])

    if namespace.key != 'None':
        api_key = namespace.key

    inputFileName = namespace.input
    servicesFileName = namespace.services

    netlas_connection = netlas.Netlas(api_key=api_key)

    inputFile = open(inputFileName, "r")

    while True:
        line = inputFile.readline()

        if not line:
            break

        line = line.replace("\n", "")

        # Lines containing "/" are treated as CIDR blocks.
        if line.find("/") != -1:
            cidrPreparing(line, netlas_connection)
        else:
            functionHub(line, netlas_connection)

    inputFile.close()

    getServices(netlas_connection)

    printResults()

    # Wait for Enter before exiting.
    wait = input()
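
The cidrPreparing function in the listing above turns a CIDR block into an inclusive range query before searching the IP WHOIS data. That conversion can be checked in isolation with the sketch below; cidr_to_range_query is a hypothetical helper name, and strict=False is an assumption that makes it accept blocks written with host bits set, which the listing itself would reject.

```python
import ipaddress


def cidr_to_range_query(cidr):
    """Turn a CIDR block into an inclusive ip:[first TO last] range query."""
    network = ipaddress.ip_network(cidr, strict=False)
    first = network.network_address.compressed
    last = network.broadcast_address.compressed
    return "ip:[" + first + " TO " + last + "]"


print(cidr_to_range_query("192.0.2.0/24"))  # ip:[192.0.2.0 TO 192.0.2.255]
```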

You can also find this and other Netlas scripts in our GitHub repo.

Note: It is important to consider that some Netlas features (for example, contact search) may not be available to some users due to subscription restrictions. You can get complete information about the features of different subscription plans here.

Conclusion

This article completes the topic of OSINT business research using Netlas. This time, the emphasis was on the company’s technology stack. Using the methods described here together with the methods from the previous article, you can conduct effective initial reconnaissance of a company you are interested in, manually or semi-automatically.

Good luck!
