Hacking Lexicon

This document clarifies many of the terms used within the context of information security (infosec). My goal is not to define/explain terms, but clarify key points and dispel misconceptions. It is not a "jargon file".

Source: http://www.robertgraham.com/pubs/hacking-dict.html
Version 0.4.0, August 21, 2000
Disclaimer: This document has many omissions and contains much that is apocryphal, or at least wildly inaccurate. This document does not define terms, but only clarifies what many people mean/imply when they use these terms in the context of information security.
Feedback: Please send feedback to "hacking-dict@robertgraham.com".
Tips: If you are trying to learn the lingo, I've tried to rate terms [1-5]; level one terms should be understood by beginners, level 4/5 terms are for experts who have no other life.

[ 0 | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z ]

- 0 -

[ 'bot | .plan | /dev/null | /dev/random | /etc | /etc/hosts | /etc/inetd.conf | /etc/passwd | /etc/services | /etc/shadow | 11 | 128-bit | 40-bit | 56-bit | 64-bit | 8 | 8-character password | 802.1q | ~user ]

128-bit [1]
Generally refers to strong (unbreakable) encryption. Web-browsers contain an option for 40-bit vs. 128-bit encryption. The United States only allows export of the weaker version in order to allow the government to spy on foreigners, especially during times of war (Author's note: my grandfather worked with the code-breakers in WWII -- it had a major impact indeed on winning the war). However, the U.S. export restrictions can easily be bypassed, allowing many foreigners access to products with 128-bit encryption (example: https://www.ccc.de/). Likewise, the restrictions have stifled development within the United States of products that need encryption, such as IEEE 802.11 wireless Ethernet.

Key point: The debate over strong encryption is never ending. Within the United States, law enforcement is constantly lobbying to restrict the use of strong encryption. Many resist, pointing out how often law enforcement already abuses wiretap powers (such as against Martin Luther King). At the same time, companies making products constantly lobby for the easing of export restrictions, so that they can sell strong encryption products abroad. Another funny thing is that the U.S. government's intransigence on this issue has actually led to stronger encryption abroad. U.S. export restrictions (and the desire to spy on foreigners) were one of the reasons France relaxed its own law-enforcement bans on encryption use by citizens.

Key point: The random number generators within systems are often weaker than the key itself. For example, when you connect via SSL from your browser to a web-server, they choose a key for that session. That key is chosen with a random number generator. One estimate was that the average 128-bit session key contains only 47-bits of randomness. Other browsers have had even weaker systems allowing the session key to be recovered in only a few minutes.
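
Example: A minimal Python sketch of why a weak random number generator undermines a long key. It assumes a hypothetical session-key generator that seeds a non-cryptographic generator with only 16 bits of entropy; the names weak_session_key and recover_seed are illustrative and do not describe any real browser.

```python
import random

SEED_BITS = 16  # assumption: the generator only has 16 bits of real entropy

def weak_session_key(seed: int) -> int:
    """Derive a 128-bit key from a small seed using a non-cryptographic PRNG."""
    return random.Random(seed).getrandbits(128)

def recover_seed(key: int) -> int:
    """Search the seed space (2**16 tries) instead of the key space (2**128)."""
    for seed in range(2 ** SEED_BITS):
        if weak_session_key(seed) == key:
            return seed
    raise ValueError("seed not found")

key = weak_session_key(12345)
print(recover_seed(key))
```

The key is 128 bits long, yet the attacker only has to search the seeds. Code that needs real keys would normally draw them from the operating system instead, for example with Python's secrets.token_bytes(16).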

40-bit [1]
The term "40-bit encryption" refers to the U.S. encryption export laws (note: in January 2000, the U.S. upped the maximum size to 64 bits). The U.S. restricts the export of "strong encryption" technology. Products that include 40-bit encryption or less can freely be exported. Therefore, products like web browsers, wireless communications, DVD keys, etc. all use 40-bit encryption.

Key point: Specialized hardware can decrypt 40-bit keys in real time. The average new desktop has enough horsepower to decrypt 40-bit messages. Thus, many people now consider 40-bit encryption to be simply obfuscated plaintext.

Key point: 40-bit often refers to the RC4 system within browsers.

56-bit [1]
56-bit encryption contains 16 more bits than 40-bit encryption, and is therefore 65536 times more difficult to crack by brute force. On the other hand, it is likewise 256 times easier to crack than 64-bit encryption.
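
Example: The ratios come straight from powers of two, since each extra key bit doubles the number of keys to try. A small Python sketch (the function name brute_force_ratio is just for illustration):

```python
def brute_force_ratio(bits_a: int, bits_b: int) -> int:
    """How many times larger the key space of bits_a is than that of bits_b.

    Assumes bits_a >= bits_b and that cracking cost is proportional to key space.
    """
    return 2 ** (bits_a - bits_b)

print(brute_force_ratio(56, 40))   # 65536
print(brute_force_ratio(64, 56))   # 256
print(brute_force_ratio(128, 40))  # 2**88
```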

Key point: In January of 1999, the EFF built a custom machine (the "Deep Crack") for $250,000 that could decrypt 56-bit DES encrypted messages in hours.

Key point: 56-bit cryptography almost always refers to DES.

64-bit [1]
In January of 2000, the U.S. government eased its export regulations on encryption from 40-bit to 64-bit keys. Presumably, the government would only do so if the NSA had the capability of decrypting 64-bit encrypted messages. It is interesting to note that distributed.net's RC5-64 challenge cracking team of 100,000 computers working for about 2.5 years had managed only to check about 18% of the keyspace. This implies that the NSA has extremely hefty software.

8-character password [4]
Some systems, like Win9x and Solaris, limit the user to 8 characters in the password.

Key point: Security conscious users of such systems need to make sure they use a more random mix of characters because they cannot create long passwords.

Key point: Password cracking such systems is a little easier.

~user [3]
On UNIX, a user's home directory can be referenced by using a tilde (~) followed by the user's login name. For example, "ls ~rob" on my computer will list all the files in "/home/rob".

Key point: Web-servers often allow access to users' directories this way. An example would be http://www.robertgraham.com/~rob.

Key point: A big hole on the Internet is that people unexpectedly open up information. For example, the file .bash_history is a hidden file in a person's directory that contains the complete text of all commands they've entered into the shell (assuming their shell is "bash", which is the most popular one on Linux).

/dev/null [1]
On UNIX, this is a virtual-file that can be written to. Data written to this file gets discarded. It is similar to the file called NUL on Windows machines.
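
Example: The discard behavior is easy to observe from a script. This Python sketch uses os.devnull, which names /dev/null on UNIX and the NUL device on Windows; the text written is arbitrary.

```python
import os

# Writing succeeds, but nothing is stored anywhere.
with open(os.devnull, "w") as sink:
    written = sink.write("this text is discarded\n")

# Reading back yields nothing: the device always reports end-of-file.
with open(os.devnull, "r") as sink:
    data = sink.read()

print(written, repr(data))
```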

Key point: When rooting a machine, hackers will often redirect logging to /dev/null. For example, the command ln -s /dev/null .bash_history will cause the system to stop logging bash commands.

Culture: In the vernacular, means much the same thing as "black hole". Typical usage, "if you don't like what I have to say, please direct your comments to /dev/null".

/etc [1]
The directory on UNIX where the majority of the configuration information is kept. It is roughly analogous to the Windows registry. Of particular interest is the /etc/passwd file that stores all the passwords.

Key point: If a hacker can read files from this directory, then they can likely use the information to attack the machine.

/etc/hosts [1]
The file that contains a list of hostname to IP address mappings. In the old days of the Internet, this is how machines contacted each other. A master hosts file was maintained and downloaded to machines on a regular basis. Then DNS came along, leaving the hosts file behind like a vestigial appendix. On Windows, this file is stored in %SystemRoot%\system32\drivers\etc.

Hack: If you can write files to a user's machine, then you can add entries to his/her hosts file to point to your own machine instead. For example, put an entry for www.microsoft.com to point to your machine, then proxy all the connections for the user. This will allow you to perform a man in the middle attack.
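
Example: A minimal Python sketch of how hosts-file text maps names to addresses, which is handy when auditing a machine for planted entries. The sample text and the address 10.0.0.99 are made up, and the "first entry wins" rule is an assumption about resolver behavior rather than a guarantee.

```python
def parse_hosts(text: str) -> dict:
    """Map each hostname in hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), ip)  # assumption: first entry wins
    return table

sample = """\
127.0.0.1   localhost
# hypothetical entry an attacker might add
10.0.0.99   www.microsoft.com
"""

hosts = parse_hosts(sample)
print(hosts["www.microsoft.com"])  # 10.0.0.99
```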

/etc/passwd [1]
The UNIX file that contains the account information, such as username, password, login directory, and default shell. All normal users on the system can read this file.

Key point: The passwords are encrypted, so even though everyone can read the file, it doesn't automatically guarantee access to the system. However, programs like crack are very effective at decrypting the passwords. On any system with many accounts, there is a good chance the hacker will be able to crack some of the accounts if they get hold of this file.

Key point: Modern UNIX systems allow for "shadowed" password files, stored in locations like /etc/shadow that only root has access to. The normal password file still exists, minus the password information. This provides backwards compatibility for programs that still must access the password file for account information, but which have no interest in the passwords themselves.

Key point: The chief goal of most hacks against UNIX systems is to retrieve the password file. Many attacks do not compromise the machine directly, but are able to read files from the machine, such as this file. Typical examples include:

TFTP
Typical exploit asks for the filename "/etc/passwd". Some systems are misconfigured so that this works.
FTP
Similar to TFTP above, simply asking for the file can get it. Directory climbing sometimes works. Sometimes a shell can be exploited to reveal the file.
HTTP
Many custom web-servers (such as built-in ones used for remote management) contain directory climbing bugs that can be used to retrieve the file. Example: http://www.robertgraham.com/../../../etc/passwd.
/cgi-bin
A huge number of CGI scripts contain bugs that can be exploited to read files from the system. These include directory climbing vulnerabilities, shell vulnerabilities, as well as other stupid mistakes.
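
Example: The directory climbing bugs above usually come down to joining a requested name onto a base directory without checking where the result lands. A minimal Python sketch of the check a server could make; the web root /var/www and the function name resolve_under_root are assumptions for illustration, and the sketch ignores URL-encoding and symbolic links.

```python
import posixpath

def resolve_under_root(root: str, requested: str) -> str:
    """Return the file path for a requested URL path, refusing any climb above root."""
    candidate = posixpath.normpath(posixpath.join(root, requested.lstrip("/")))
    if candidate != root and not candidate.startswith(root + "/"):
        raise PermissionError("directory climbing attempt: " + requested)
    return candidate

print(resolve_under_root("/var/www", "/index.html"))

try:
    resolve_under_root("/var/www", "/../../../etc/passwd")
except PermissionError as err:
    print("blocked:", err)
```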

Key point: /etc/passwd is a simple text file, with one line per account. The line is broken down into seven columns:

account
The username. Note that a lot of systems ship with well-known names in their default passwd file.
password
An encrypted form of the user's password. Since they are encrypted, they are viewable by anybody who has access to the system. However, since users often choose weak passwords, hackers will often run crack programs that can decrypt the weak passwords. For this reason, administrators often create a shadow password file that contains the real passwords, in which case this field will simply contain a "*".
UID
The user identifier, a unique number like "500" that identifies the user. Internally within the system, all users are referenced by their number rather than their name. One way to put a backdoor into the system is to place a string like "x500" rather than "500" in this field. This causes programs that read the file to parse this as the number "0", which is the UID for root.
GID
A primary group the user belongs to. The user can belong to secondary groups as configured in /etc/group.
GECOS
Some additional information about the account. For real users, this is often their full human readable name. For other pseudo-accounts, this may be some parameters.
directory
The user's home directory.
shell
The login shell that will be given to the user when they log on.
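
Example: The seven columns are separated by colons, so a line is easy to pick apart. The Python sketch below parses one made-up line and also imitates the C atoi() style of number parsing to show why a UID field of "x500" can come out as 0. The account line, the class name PasswdEntry, and the simplified atoi (no sign handling) are all assumptions for illustration.

```python
from typing import NamedTuple

class PasswdEntry(NamedTuple):
    account: str
    password: str
    uid: str
    gid: str
    gecos: str
    directory: str
    shell: str

def parse_passwd_line(line: str) -> PasswdEntry:
    """Split one /etc/passwd line into its seven colon-separated fields."""
    fields = line.rstrip("\n").split(":")
    if len(fields) != 7:
        raise ValueError("expected 7 colon-separated fields")
    return PasswdEntry(*fields)

def atoi(text: str) -> int:
    """Imitate C's atoi(): read leading digits, yield 0 if there are none."""
    digits = ""
    for ch in text.strip():
        if ch not in "0123456789":
            break
        digits += ch
    return int(digits) if digits else 0

entry = parse_passwd_line("rob:*:500:500:Robert David Graham:/home/rob:/bin/bash")
print(entry.account, entry.shell)
print(atoi("500"), atoi("x500"))  # 500 0
```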

See also: shadowed passwords

/etc/services [3]
On UNIX, the configuration file /etc/services maps port numbers to named services.

Key point: Its role in life is so that programs can do a getservbyname() sockets call in their code in order to get what port they should use. For example, a POP3 email daemon would do a getservbyname("pop3", "tcp") in order to retrieve the number 110 that pop3 runs at. The idea is that if all POP3 daemons use getservbyname(), then no matter what POP3 daemon you run, you can always reconfigure its port number by editing /etc/services.
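
Example: A minimal Python sketch of the lookup, run against made-up sample text rather than the real file so that it behaves the same everywhere. On a live system the standard sockets call for this is getservbyname(), which Python exposes as socket.getservbyname("pop3", "tcp"). The function name parse_services is illustrative.

```python
def parse_services(text: str) -> dict:
    """Map (service name, protocol) to a port number from /etc/services-style text."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, port_proto, *aliases = line.split()
        port, proto = port_proto.split("/")
        for service in (name, *aliases):
            table[(service, proto)] = int(port)
    return table

sample = """\
smtp   25/tcp   mail
pop3   110/tcp  pop-3   # POP version 3
"""

services = parse_services(sample)
print(services[("pop3", "tcp")])  # 110
```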

Misunderstanding: This file is a poor way to figure out what port numbers mean. If you want to find out what ports programs are using, you should instead use the program lsof to find out exactly which ports are bound to which processes. If running lsof is not appropriate, then you should look up the ports in a more generic reference.


- A -

[ A | Access Control List | accountability | ACK | Acknowledgement Number | active attack | ActiveX | administrator | age | AH | algorithm | anarchy | ANI | anonymous | anti-replay | anti-virus | application/form-url-encoded | ARP | ARP redirect | ASP | asymmetric cryptograph | attack | audit | auth | authentication | Authentication Header | Authenticode | authorization ]

Access Control List [3]
Controlling access not only to the system in general, but also to resources within the system. For example, firewalls can be configured to allow access to different portions of the network for different users. Likewise, even after you log onto a file server, the server may still block access to certain files.

Key point: An Access Control List (ACL) is used to list those accounts that have access to the resource that the list applies to. When talking about firewalls, the ACL implies the list of IP addresses that have access to which ports and systems through the firewall. When talking about WinNT, the ACL implies the list of users that can access a specific file or directory on NTFS.

Contrast: Discretionary Access Control is the ability to have fine grained control over who has access to what resources.

accountability [1]
In infosec, the word accountability refers to the ability to trace actions back to the person who did them. This includes finding out who violated security policies, as well as such simple things as charging departments for their use of network resources.

Controversy: A major human rights debate these days is between accountability and anonymity. On one hand, you want to make criminals accountable for their actions, but this intrudes upon the privacy of individuals who do not want their every action recorded.

Contrast: The term accountability typically refers to the general issue of tracing actions back to individuals, whereas accounting refers to the process of actually recording those actions.

ActiveX [1]
A type of mobile code whereby Microsoft's web browsers can automatically download executables to provide active content within web pages.

Contrast: ActiveX is similar to Java applets, except that the code is not "sandboxed": it has full access to the operating system. In order to stop hostile code, ActiveX relies upon digital signatures and "zones". Microsoft browsers are configured to trust ActiveX programs from servers in the "trusted" zone, to trust signed ActiveX programs from servers in less trusted zones, and to prompt/deny unsigned ActiveX applets from untrusted zones.

Controversy: The idea of trusted zones and signed applets works pretty well in theory, but doesn't always work well in practice. The problem is that it relies upon all users making the correct choices all the time. The Melissa virus/worm proved that this philosophy is not adequate.

advocacy [1]
  • EFF - Electronic Frontier Foundation
  • EPIC - Electronic Privacy Information Center
  • CDT - Center for Democracy and Technology

algorithm [1]
A series of steps specifying which actions to take in which order. In the security field, this general term usually refers to an encryption algorithm.

Analogy: A cookbook recipe is an algorithm.

Key point: Different algorithms have different levels of complexity. For example, consider the ancient parable (Babylonian?) about a king and a wise subject who did a favor for him. The subject asked for one piece of grain to be placed on the first square of a chess board, two grains on the second, four grains on the third, and so on, doubling the amount of grain for each successive square.

This problem demonstrates an algorithm of exponential complexity. For the first 10 squares of the chess board, the series is: 1 2 4 8 16 32 64 128 256 512. Thus, for the first 10 squares, roughly a thousand grains must be paid out. However, the series continues (using K=1024): 1k 2k 4k 8k 16k 32k 64k 128k 256k 512k. Thus, for the first 20 squares, roughly a million grains must be paid out. After 30 squares, roughly a billion grains must be paid out. For 40 squares, roughly a trillion grains must be paid out.
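
Example: The running totals can be checked with a few lines of Python. The function name grains_paid is just for illustration; the total after n squares is 1 + 2 + 4 + ... which equals 2**n - 1.

```python
def grains_paid(squares: int) -> int:
    """Total grains owed after the given number of squares (1 + 2 + 4 + ...)."""
    return sum(2 ** i for i in range(squares))  # equals 2**squares - 1

# Totals after 10, 20, 30, 40 squares and for the full 64-square board.
for n in (10, 20, 30, 40, 64):
    print(n, grains_paid(n))
```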

This is directly related to such things as key size. A 41-bit key is twice as hard to crack as a 40-bit key. A 50-bit key is a thousand times harder. A 60-bit key is a million times harder. This is why the 128-bit vs. 40-bit encryption debate is so important: 128-bit keys are 2^88 times, or roughly 300 trillion trillion times, harder to crack (via brute force) than 40-bit keys.

Key point: Most algorithms are public, meaning that somebody trying to decrypt your message knows all the details of the algorithm. Consequently, the message is protected solely by the key. Many people try to add additional protection by making the details of the algorithm secret as well. Experience so far has led to the belief that this actually leads to weaker security for two reasons. First, such secrets always get discovered eventually, so if security depends upon this secret, it will eventually be broken. Secondly, human intelligence is such that someone cannot create a secure algorithm on his/her own. Therefore, only by working with a community of experts over many years can humans create a secure algorithm. To date, only two such communities exist: the entire world of cryptography experts publishing the details of their work and trying to break other people's work, and the tightly knit community of cryptography experts working in secret for the NSA.

anarchy [1]
In the hacking culture, there is a strong belief in anarchy, that laws should not be created for cyberspace nor can they be enforced without grievous infringement on civil liberties.

Contrast: Cyberspace anarchy and real-world anarchy are different. The main thrust is that cyber-punishment should fit cyber-crime, and physical-punishment should only be used in cases of physical-crime.

Example: Most of the cyber-anarchy focuses on cryptography, or crypto-anarchy. This is because most anarchic capabilities will be based in cryptography.

ANI (Automatic Number Identification) [3]
(U.S.) In telephones, ANI identifies the caller to the recipient. In most cases, ANI cannot be blocked, such as when dialing 800 lines. Consumer caller-ID is essentially based upon ANI functionality. (Consumers can prevent caller-ID from working so that other consumers don't receive it, but they cannot block ANI when dialing 800 lines).

anonymous [2]
Anonymity is one of the "holy grails" of hacking. The idea is that a human being can use a system or send messages while protecting their identity from being disclosed.

Example: Anonymous e-mail services like Hotmail put the IP address of the person sending the e-mail in the headers (which are normally hidden from view by e-mail clients). Many would-be hackers get caught this way.

Example: France is currently trying to outlaw Internet anonymity, forcing users to disclose their identity.

Contrast: Anonymity is one aspect of privacy.

ARP [3]
ARP is a protocol used with TCP/IP to resolve addresses. The TCP/IP stack used to transmit data across the Internet is independent from the Ethernet used to shuttle data between local machines. Thus, when a machine needs to send an IP packet to a nearby machine, it broadcasts the IP address on the local Ethernet asking for the corresponding Ethernet address. The machine that owns the address responds, at which point the IP packet in question is sent to that Ethernet address.

Key point: By sniffing ARP packets off the wire, you can discover a lot of stuff going on. This is especially true of cable-modem and DSL segments. Since ARP packets are broadcasts, you aren't technically breaking your user agreement by sniffing.

Key point: You can spoof ARP requests and/or responses in order to redirect traffic through your machine.

ARP redirect [3]
A tool that is part of the standard hacker's toolkit, ARP redirect will redirect Internet traffic from a local neighbor through your own machine allowing you to sniff it.

ASP (Active Server Pages) [3]
The server-side scripting language for Microsoft IIS web server.

Key point: A recurring bug in ASP has allowed hackers to read the script rather than the output of the script. These techniques rely upon changing the name of the script such that the server no longer recognizes it as a script, but as a file instead. Some techniques that have worked in the past have been:

/default.asp.
The file system automatically strips trailing dots because of the way Windows hides/appends file extensions.
/default.asp%2E
Same bug as above. Microsoft released a patch whereby the web-server checks for the appended dot. However, url-encoding the dot bypasses this quick fix.
/default.asp::$DATA
In order to support Macintoshes and other features, NTFS supports a feature known as alternate data streams. The well-known stream called "::$DATA" references the original
/default.asp%8129
Far east editions will expose the source when a Unicode character is appended.
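
Example: The %2E trick works against any check that inspects the raw request before decoding it. The Python sketch below is a model of that class of bug, not the actual IIS code; both function names are made up for illustration.

```python
from urllib.parse import unquote

def naive_has_trailing_dot(raw_path: str) -> bool:
    """The 'quick fix': look for an appended dot in the raw, still-encoded path."""
    return raw_path.endswith(".")

def decoded_has_trailing_dot(raw_path: str) -> bool:
    """Decode the URL first, then look for the appended dot."""
    return unquote(raw_path).endswith(".")

for path in ("/default.asp", "/default.asp.", "/default.asp%2E"):
    print(path, naive_has_trailing_dot(path), decoded_has_trailing_dot(path))
```

The naive check catches "/default.asp." but misses "/default.asp%2E", while the check made after decoding catches both.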

attack [1]
In security, the word attack has taken on very specific connotations. For example, you might hear of researchers trying to "attack a cryptosystem". The word is often used in the abstract sense rather than in any physical sense. In academic circles, this word is often used in preference to other synonyms such as crack or break.

Example: Some classifications of attacks are:

passive vs. active attacks
A passive attack (like sniffing) is one that can take place by eavesdropping. An active attack is one that requires interaction, such as injecting something into the data stream or altering data. All attacks are divided into these two categories. Note that active attacks can in theory be detected, while passive attacks cannot be.
hit and run vs. persistent attacks
A ping of death is a hit and run attack because it quickly crashes a machine. A smurf attack is persistent because the victim is affected only as long as the smurf lasts. As soon as the attacker stops smurfing, the victim's link becomes active again.
replay attack
An active attack where you try to capture parts of a message then resend it at a later date, often with slight alterations. For example, on older Windows LAN Manager protocols, a hash of the password is sent. Therefore, anybody could write their own SMB protocol stack and replay the hash in order to break into the system.
brute force attack
Tirelessly tries all combinations until they can break in.
man in the middle attack
Either eavesdrops on an existing connection, or interposes himself in the middle of a connection changing data.
hijack
Takes over one side of an existing connection.
sniffing/wiretap/eavesdropping
A passive attack consisting of eavesdropping on a network connection.
rewrite
An attack that alters an encrypted message without first decrypting it. Block-ciphers

authentication [3]
In cryptography, authentication is the method used to verify something is what it claims to be. The antonym of authentication is forgery. The two primary areas of authentication are user authentication (proving that Bob is who he says he is) and message authentication (proving that your nuclear missile launch orders weren't forged or corrupted).

Example: When you log in with your username and give the password, you are authenticating yourself to the system. You are proving that you are you because, in theory, only you know your password.

Key point: Abstractly, anything that combats forgery is called authentication. For example, IPsec includes an Authentication Header (AH) that proves that a packet hasn't been modified in transit.
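
Example: Message authentication is commonly done with a keyed hash. This Python sketch uses HMAC-SHA256 from the standard library; it does not reproduce the IPsec AH format, and the key and messages are made up for illustration.

```python
import hashlib
import hmac

def sign(key: bytes, message: bytes) -> bytes:
    """Compute an authentication tag that only holders of the key can produce."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(key, message), tag)

key = b"shared secret (hypothetical)"
tag = sign(key, b"launch at dawn")

print(verify(key, b"launch at dawn", tag))  # True
print(verify(key, b"launch at dusk", tag))  # False: the message was altered
```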

Contrast: Note that there is a small difference between authentication and authorization. In one case, once you authenticate somebody's identity, your next step is to figure out if they are authorized to do what they are asking to do (i.e. log onto the server). In other cases, authorization is independent from authentication, such as not allowing anybody to logon after midnight.

Examples:

biometrics
Signature (handwriting), facial features, fingerprint, etc.
smart-card
passwords
digital certificates

Contrast: There are roughly three "factors" used in authentication:

physical (what you have)
car keys, subway tokens, driver's license, passport, credit cards, ID cards, smart cards
knowledge (what you know)
PINs, usernames/passwords, account numbers
biometrics (who you are)
signature, fingerprint, what you look like, etc.

audit [1]
The word audit has two meanings.

The first is the security audit, whereby a consulting firm comes in and validates a company's security profile. This is conceptually similar to how accounting firms will come in every quarter and review a company's books.

The second meaning is infosec specific, and refers to an "auditing" subsystem that monitors actions within the system. For example, it might keep a record of everyone who logs onto a system.

authenticity [3] TODO

authorization [3] TODO

availability [3]
In infosec, availability describes the need that resources must be continuously available. For example, in the Kosovo war, the European forces bombed power plants in order to destroy the availability of electricity. Another example is in February of the year 2000, when massive DDoS attacks brought down major websites (making them "unavailable").

Antonym: The opposite of the infosec term "availability" is the hacking term "DoS".


- B -

[ back channel | back door | Back Orifice | backticking | banner | BASE64 | bash | BGP | big-endian | binary | BIND | BinHex | biometrics | BIOS | bit | BlackNet | block cipher | Blowfish | boink | bomb | bonk | boot sector | bootp | broadcast | browser | brute force | BS7799 | BSD | buffer overflow | buffer overrun | byte-order ]

back channel [4]
Where the compromised system opens a connection back to the hacker.

Contrast: Remote administration trojans (RATs) are NOT examples of back channels, but are instead forward channels. A RAT allows the hacker to contact the system from anywhere in the world, and allows the hacker to hide where he/she is coming from. A back channel, on the other hand, will contact the hacker, who must have a fixed IP address. This clearly fingers who the hacker is.

Key point: Typical back channel protocols are X Windows (xterm) and shells like Telnet. These programs are often built into the victim's system, so many attacks that can't otherwise compromise the system can still trigger a back channel that allows a remote shell.

See also: covert channel

back door (trap door) [3]
Something a hacker leaves behind on a system in order to be able to get back in at a later time.

Example:

  • A Y2K programmer comes in to fix your banking code, but leaves behind something in the software that allows him to log into an ATM and withdraw lots of money.
  • Somebody walking by your computer notices that you are logged in with root/administrator privileges. She creates her own account that will allow her to get back into the system at a later date.
  • A hacker breaks into your UNIX machine and installs what is known as a "rootkit": a series of programs and configuration errors that will allow the hacker back into the system. There are so many items in a rootkit that it is unlikely the owner of the system can clean the entire thing out.
  • When a hacker breaks into your system, he leaves behind a program that will allow him to log in with a special username/password.
  • A hacker sends you a Trojan that installs a backdoor when you run it.

Key point: Key features of backdoors are:

  • They try to evade traditional "cleanup" methods. E.g. even if the administrator changes all the passwords, cleans the registry/configuration files, and removes all the suspect software, a good backdoor will still be live on the system.
  • They try to evade logging: if every incoming connection to the system is logged, there is a good chance the backdoor provides a way to log in without being logged.
  • They hide well. If you scan the system looking for suspect software, there is a good chance the backdoor has used techniques to hide from this scan.

Key point: Back doors are frequently programmed into systems either benignly or maliciously. Most computers shipped today allow BIOS passwords to be set that will prevent the booting of the computer without the administrator first typing the password. However, since many people lose their password, such BIOSes often have a back door password that allows the real password to be reset. Similarly, a lot of remotely manageable network equipment (routers, switches, dialup banks, etc.) has backdoors for remote Telnet or SNMP. The frequency of such back doors is due to the fact that people are stupid, set passwords, forget them, then whine to customer support.

Key point: A backdoor can be added to any system. For example, when generating random session keys, a programmer may actually subvert the random number generator. Such subversion would then allow decrypting of the message by those who knew the specifics. This has already been done accidentally; some paranoids believe that some encryption products do this intentionally in order to get export approval of 128-bit products.

See also: trap-door

Back Orifice (BO) [2]
A remote access trojan released in 1998 by the Cult of the Dead Cow (cDc). By promulgating this through their well-oiled propaganda machine, the cDc succeeded in making Back Orifice the archetype for all such programs. In 1999, the cDc released a newer version called "BO2K - Back Orifice 2000".

banner [3]
Many text-based protocols will issue text banners when you connect to the service. These can usually be used to fingerprint the OS or service.

Key point: Many banners reveal the exact version of the product. Over time, exploits are found for specific versions of products. Therefore, the intruder can simply look up the version numbers in a list to find which exploit will work on the system. In the examples below, the version numbers that reveal the service has known exploitable weaknesses are highlighted.

Example: The example below is a RedHat Linux box with most of the default services enabled. The examples below show only the text-based services that show banners upon connection (in some cases, a little bit of input was provided in order to trigger the banners). Note that this is an older version of Linux; exploits exist for most of these services that would allow a hacker to break into this box (most are buffer-overflow exploits).

Protocol  Port  Banner
FTP       21    220 rh5.robertgraham.com FTP server (Version wu-2.4.2-academ[BETA-15](1) Sat Nov 1 03:08:32 EST 1997) ready.
ssh       22    SSH-2.0-2.1.0 SSH Secure Shell (non-commercial)
Telnet    23    Red Hat Linux release 5.0 (Hurricane)
                Kernel 2.0.31 on an i486
                login:
SMTP      25    220 rh5.robertgraham.com ESMTP Sendmail 8.8.7/8.8.7; Mon, 29 Nov 1999 23:28:31 -0800
finger    79    Login     Name                 Tty  Idle  Login Time   Office     Office Phone
                rob       Robert David Graham   p0        Nov 29 22:51 (gandalf)
                root      root                  p1        Nov 29 23:34 (10.17.128.201:0.0)
HTTP      80    HTTP/1.0 200 OK
                Date: Tue, 30 Nov 1999 07:34:59 GMT
                Server: Apache/1.2.4
                Last-Modified: Thu, 06 Nov 1997 18:20:06 GMT
                Accept-Ranges: bytes
                Content-Length: 1928
                Content-Type: text/html
POP3      110   +OK POP3 rh5.robertgraham.com v4.39 server ready
identd    113   0 , 0 : ERROR : UNKNOWN-ERROR
IMAP4     143   * OK rh5.robertgraham.com IMAP4rev1 v10.190 server ready
lp        515   lpd: lp: Malformed from address
uucp      540   login:

Defenses: Many systems allow banners to be suppressed. You should read the software documentation for more information on this.
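
Example: To check what your own banners give away, you can pull the product and version out with a few regular expressions. This Python sketch runs against three of the banner strings shown above; the patterns and the product label "wu-ftpd" are my own assumptions and would need extending for other services.

```python
import re

BANNERS = {
    21: "220 rh5.robertgraham.com FTP server (Version wu-2.4.2-academ[BETA-15](1) "
        "Sat Nov 1 03:08:32 EST 1997) ready.",
    25: "220 rh5.robertgraham.com ESMTP Sendmail 8.8.7/8.8.7; "
        "Mon, 29 Nov 1999 23:28:31 -0800",
    80: "Server: Apache/1.2.4",
}

# Illustrative patterns only: (product label, regex capturing the version).
PATTERNS = [
    ("wu-ftpd", re.compile(r"Version wu-([\w.\-]+)")),
    ("Sendmail", re.compile(r"Sendmail (\d+(?:\.\d+)*)")),
    ("Apache", re.compile(r"Apache/(\d+(?:\.\d+)*)")),
]

def fingerprint(banner: str):
    """Return (product, version) for the first matching pattern, else None."""
    for product, pattern in PATTERNS:
        match = pattern.search(banner)
        if match:
            return product, match.group(1)
    return None

for port, banner in sorted(BANNERS.items()):
    print(port, fingerprint(banner))
```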

BGP (Border Gateway Protocol) [3]
On the Internet, BGP is used between ISPs in order to communicate routes. For example, imagine that the ALICE ISP needs to reach the BOB ISP. However, ALICE is not directly connected to BOB. ALICE therefore must figure out which ISP should be used to send traffic to BOB. It is through the use of BGP that such information is discovered. The name "border" comes from the fact that ISPs use BGP only on their borders (in contrast, they would use some other protocol (like OSPF) inside their networks).

Key point: BGP can be subverted in numerous ways. BGP is generally unauthenticated, and rogue ISPs can play havoc.

binary [1]
One of the basic foundations upon which computer science is based, binary is simply the concept of representing all things as a series of 1s and 0s. Mathematically, this means that all numbers are represented in base-2 arithmetic, and that all things are represented with numbers.

Contrast: The word binary usually means not text. In computers, every 8 binary digits are used to represent a byte. However, only 7 binary digits are needed to convey text (26 upper case, 26 lower case, 10 decimal digits, a number of punctuation characters, etc). Therefore, data using just 7 binary digits per byte is always text data. It is pointless to say binary computer data, since all computer data is binary. When someone says binary, rather than being redundant, what they are really trying to convey is that the data in question isn't text data. For example, FTP is a text protocol, whereas SMB is a binary protocol.
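
Example: The "7 binary digits" rule is easy to test: data is 7-bit text only if no byte has its high bit set. A minimal Python sketch, with a made-up FTP command and the first bytes of an SMB header as samples (the function name is_seven_bit is illustrative):

```python
def is_seven_bit(data: bytes) -> bool:
    """True if every byte fits in 7 bits, i.e. the high bit is never set."""
    return all(byte < 0x80 for byte in data)

ftp_command = b"USER rob\r\n"      # text protocol: plain 7-bit characters
smb_header = b"\xffSMB"            # binary protocol: starts with byte 0xFF

print(is_seven_bit(ftp_command))   # True
print(is_seven_bit(smb_header))    # False
```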

Misconception: The word is also a noun (as well as the usual adjectival sense). A binary is a file containing binary (as opposed to text) data. In particular, you might hear the phrase "hackers replace the binaries on the victim's machine". What this really means is that the hackers have replaced many of the software programs (with trojans). This phrase comes about because executable programs contain binary, not text, data. Therefore, a machine's binaries are its programs.

See also: A common issue is how to send binary data within a text protocol/message. For example, how can we send a binary within a text e-mail message? The answer is to "encode" the data. See the word encoding for more details.
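
Example: A minimal sketch of such an encoding in Python, assuming Base64 as the scheme (one common choice; the sample bytes below are made up for illustration). The helper checks whether data fits in 7 bits per byte, which is the sense of "text" used above.

```python
import base64

# Hypothetical binary payload: some bytes fall outside the 7-bit text range.
payload = bytes([0x00, 0xFF, 0x90, 0x41])

def is_seven_bit(data: bytes) -> bool:
    """True when every byte fits in 7 bits, i.e. could travel as plain text."""
    return all(b < 0x80 for b in data)

encoded = base64.b64encode(payload)   # text-safe form for the e-mail body
decoded = base64.b64decode(encoded)   # the receiver recovers the original bytes
```

The encoded form is larger than the original (roughly 4 characters for every 3 bytes), which is the price paid for squeezing binary through a text channel.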

biometrics [3]
In the field of authentication, biometrics is the method whereby a person is recognized according to personal traits, presumably ones they cannot alter. Typical examples are signatures we sign on documents and facial recognition that we use in everyday life.

History: The ancient Egyptians used biometrics in order to verify somebody's identity. They would make several measurements of body features (e.g. length of arms) and record them. Fingerprints have actually only been used in the last 100 years.

Example: The market for biometrics in the year 2000 was roughly $100 million. There are many methods, each with their own pros and cons (accuracy, ease of use, end-user prejudice, etc.).

fingerprints 40%
The old standby that everyone is familiar with, though they carry a certain stigma due to their long-term use in law enforcement. Most such systems use just the thumbprint. California is now requiring thumbprints for its driver's licenses.
hand 30%
This is generally your palm print, though it can also include the geometry of your fingers.
voice 15%
Due to numerous problems (such as a cold affecting a person's voice and recorded playback), this method is becoming less popular. Its chief benefit is that it can use any microphone to record the voice, and any modern computer can do the necessary analysis on the voice signal. Some of these systems have been used for telephone authentication.
face 7%
Tends to focus on facial features between forehead and lips in order to avoid complications with hair style, facial hair, and facial expression. Some scanners do thermal imaging of the face, which in theory can distinguish among identical twins (which could otherwise stymie other systems).
eye 4%
Includes iris as well as retina scanning. The iris is the outer part of the eye that we associate with eye color. The retina is inside the eye, from which distinct patterns of blood vessels can be measured. This system is considered the most accurate, but at the same time it is technically difficult to get right (as users have to be trained to position their eyes correctly).
handwriting signature 3%
The same system used to sign your checks. Some systems are just for a person's signature, others try to encompass the entire person's handwriting. This method is becoming more popular for PDAs. An issue with this system is that it is behavioral, rather than physical.
Other
Gait (how you walk), typing characteristics, body odor, DNA (movie Gattaca), reflection of radio waves within the body, reflection/resonance of sound waves within the skull.
Voice and signature recognition are considered some of the least reliable techniques, though they are among the more friendly.

Point: One area of biometrics focuses on those cases where the user isn't aware of the scan. For example, an airport might have a facial features scanner designed to trigger on known terrorists. Equipment could be installed under the floor in order to identify people according to their gait as they walk over it (such systems can distinguish among multiple people walking simultaneously). Body odor and DNA can be extracted from a person's "thermal plume" as they walk under a sniffing system.

Controversy: Biometrics introduces a huge privacy debate. For the first time, it provides the government with a means to track its citizens in a manner that the citizens cannot avoid. This gives totalitarian governments the ability to tightly control their populations. At the same time, it provides businesses equal opportunity to invade their employees' and customers' privacy.

Controversy: Biometrics is based upon a single, unalterable identity. A private-key, for example, can be destroyed in case it is compromised (through key revocation). However, your biometrics are with you for life. Today's authentication is usually through pseudonyms that are only roughly related to who you really are.

Key Point: Biometrics has a number of problems. The first is that biometrics degrade over time. People's signatures change over time. An injury can change fingerprints. Voice recognition systems fail when people have a cold. Not all people have the requisite physical features (eyes, hands, etc).

Pros: Biometrics cannot be forgotten; many companies are adopting biometrics as a cost-saving measure because lost passwords are becoming a leading problem in IT departments. Biometrics cannot be passed on from one person to another. Biometrics are extremely difficult to forge.

Culture: Biometrics have appeared frequently in movies, partially because of the Orwellian horrors they elicit from the audience. The entire plot of the movie Gattaca was based upon DNA biometrics. The Bond film "Diamonds are Forever" used a trick of thin rubber over the fingertips to forge someone else's fingerprints -- a trick that has been recently shown to work. Another Bond film used the trick of surgical alteration in order to fool an iris scanner.

BIOS [3]
On your PC, the BIOS is the software that first runs when your computer starts up. All the messages you see when it starts up are from the BIOS program. Once it gets through testing memory and configuring your system, it then "boots" the operating system that you've installed on your hard-disk.

Key point: The BIOS stores configuration settings in NVRAM (Non-Volatile RAM). Remember that the contents of your normal RAM/memory are lost when you power-off your computer. The contents of NVRAM, in contrast, are retained when power goes off. Most NVRAM consists of CMOS (low-power) chips with a small battery that constantly feeds power to the chips (such batteries last about 5 years). A common trick of hackers and viruses is to corrupt the CMOS settings, causing the computer to fail to boot. Removing the battery connection (usually a jumper on the motherboard) will cause the CMOS settings to be lost and reset back to a default (good) state.

Key point: All of today's BIOSes are stored in programmable ROMs, which allows them to be reprogrammed (usually with bug fixes from the manufacturer). This allows the hacker to reprogram them as well. While in theory the hacker could reprogram his/her own code into the BIOS, in practice this has not been done yet. Instead, hackers can sometimes use this programming feature to corrupt the BIOS code (in much the same way they corrupt the BIOS settings mentioned above). This will usually prevent the system from booting even to a point where a fresh BIOS can be re-programmed into the system. This requires that the system be brought back to the vendor in order to have the BIOS reprogrammed. Note that you can often set a jumper on the motherboard that denies the ability to reprogram the BIOS.

BIND [3]
The most popular software on the Internet for providing DNS services. Your ISP is likely running BIND.

Key point: BIND provides about 80% of all DNS services. It is also enabled by default on a lot of Linux distributions. As a result, any exploit discovered for BIND has immediate and large impact on the Internet. As of November, 1999, all versions of BIND previous to 8.2.2-P5/4.9.7 have known holes that can be exploited. It is likely that these newer versions have undiscovered exploitable holes as well.

Key point: BIND comes in two versions, 4.x and 8.x. This is largely due to backwards compatibility: people are running a lot of older servers and would rather patch them than upgrade to a newer version. Also, the newer 8.x code-base has not been extensively peer-reviewed and is thought to be a lot less secure than the 4.x source base.

See also: dig, DNS

bit [1]
A numeric quantity with precisely two values, such as 0 and 1, false and true, up or down, and so forth.

Key point: In many contexts, each additional bit means "twice as much". 8 extra bits means 256 times as much. 16 extra bits means 65536 times as much. Therefore, it takes 65536 times longer to brute force crack a 56-bit key than a 40-bit key.
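
Example: The arithmetic above can be checked with a few lines of Python. This is only a sketch of the counting argument; the function name is made up for illustration, and it assumes every key is equally costly to test.

```python
def keyspace(bits: int) -> int:
    """Number of distinct keys that fit in the given number of bits."""
    return 2 ** bits

# How many times larger is the 56-bit keyspace than the 40-bit one?
ratio = keyspace(56) // keyspace(40)   # the 16 extra bits contribute 2**16
```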

BlackNet [2]
A cultural term referring to an anonymous black-market in hacker goods, especially information. Think of it as an eBay where both buyer and seller can be totally anonymous and information is the item being traded. Let's say that a hacker steals trade secrets from a company; the hacker would then be able to sell this on the auction. The idea of BlackNet is rooted in cryptography. First of all, there has to be complete anonymity. Secondly, there has to be a solution to the race condition where the buyer has to be assured he is getting the goods before delivering payment, and the seller has to be assured of receiving payment before delivering the goods. Finally, the problem of fraud (misrepresentation of goods) has to be solved: the seller has to prove he has the goods claimed. Cryptographic solutions to these problems do exist; such a market is possible, though it does not yet truly exist.

bomb (logic bomb or mail bomb)[3]
The word bomb has two unrelated meanings: logic bombs and mail bombs.

In the class of hostile software, a logic bomb is some code left behind by a program that "goes off" at a particular time (such as deleting all the files on the computer on New Years Eve). One theory was that Y2K consultants left logic bombs inside the code they were fixing in order to earn even more money after Y2K.

A mail bomb is the effect of sending somebody tons of e-mail, overloading their mailbox and/or network connection. Sometimes this can be done with a program, other times it can be done simply by signing up the victim to huge numbers of e-mailing lists. Finally, it can be accidental, as happened once to Apple Computer when its mailing list software got out of control.

History: In the old days of UNIX terminals, an e-mail message containing VT100 control codes in a logic bomb could completely hose a user's terminal, forcing them to log out. DOS machines supporting the ANSI.SYS driver also had that problem.

bootp (boot protocol)[1]
This relatively ancient protocol facilitates booting devices ("clients") from a network server rather than their local hard-disks (such as diskless workstations). In this configuration, the bootp protocol configures the diskless device with its IP configuration information as well as the name of the file server. At this point, the client shifts to TFTP to download the actual files it will use to boot from.

Key point: DHCP is simply an extension on top of bootp. This is important because without an IP address, clients cannot reach bootp servers that reside across routers. Virtually all routers have an extension for bootp forwarding that fixes this issue. Since DHCP had the same requirements, the designers just stuck it inside bootp packets rather than requiring yet another change to the routing infrastructure.

boot sector (boot record)[1]
The first sector on a drive, from which the operating system will bootstrap.

Key point: Until macro viruses came along, boot sector viruses were the most common variant. They spread through companies via floppy disks. Users would leave floppy disks in the drive and when the computer restarted, it would attempt to boot from the floppy. This would run the virus, which then infected the boot sector on the hard drive. Any further floppies plugged into the system would then be infected by the virus.

Countermeasures: I worked at a company with anal anti-virus procedures (anti-virus on all desktops, regular wiping of floppy disks). It was never able to completely free itself from the boot sector virus problem; one of the viruses was never successfully eradicated from the company. My own personal policy is to disconnect the floppies on 90% of the machines, and disable floppy bootup on the remaining machines.

'bot [2]
Short for robot, a 'bot is an automated program that does something.

Example: A cancel-bot is a program that attempts to cancel lots of messages within USENET newsgroups. These are sometimes used by the USENET Death Penalty or rogue cancellers.

Example: Search engine spiders that index the web follow web-page links, going from site to site, downloading web-pages.

Example: In the IRC wars, hackers run automated bots to control channels. These are programs (usually in C) that help in administering channels, protection against hackers, flooding, and so forth.

broadcast [1]
The term "broadcast" is generic and is used in many different areas. The origin of the term obviously means to cast out broadly, such as a radio broadcast.

Subdefinition: Ethernet has broadcast domains, allowing you to partially sniff some data from your neighbors, and possibly subvert it. Typical protocols that can be sniffed and subverted in this manner are: ARP, NetBIOS, MSBROWSE, rwho, bootp/DHCP, SNMP. An Ethernet broadcast address is "FF:FF:FF:FF:FF:FF".

Subdefinition: The Internet protocols TCP/IP support a feature known as a directed broadcast, which allows a remote person the ability to send a single packet to an entire subnet. This will then take advantage of the Ethernet broadcast domain once it reaches its destination. Attacks like smurf take advantage of this. A directed broadcast address looks something like 192.0.2.255, where the last integer "255" means "all devices on subnet 192.0.2.x".
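
Example: A minimal sketch, using Python's standard ipaddress module, of how the directed broadcast address follows from the subnet. The /24 mask is an assumption chosen to match the "192.0.2.x" example above; real subnets may use other masks, in which case the broadcast address need not end in 255.

```python
import ipaddress

# The example subnet from the text; the /24 prefix length is assumed.
subnet = ipaddress.ip_network("192.0.2.0/24")

# The highest address in the subnet is its directed broadcast address.
directed_broadcast = subnet.broadcast_address

def is_directed_broadcast(address: str, network: ipaddress.IPv4Network) -> bool:
    """True when address is the all-hosts address of the given subnet."""
    return ipaddress.ip_address(address) == network.broadcast_address
```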

Subdefinition: The special IP address of "255.255.255.255" is the local broadcast, and causes the packets to be sent to everyone locally, but not across the Internet.

browser [1]

Key point: Netscape and Microsoft have not yet produced a browser that is hardened against predation from hostile websites.

Key point: Disabling Java, JavaScript, and ActiveX will lock out virtually all hacks against the browser. However, this will also lock out many websites.

brute force [3]
A classic attack technique whereby all possible combinations are attempted until one succeeds. This typically refers to cryptography, either finding the right key to decrypt a message, or discovering somebody's password.

Analogy: If you somehow steal somebody's ATM card, you could try to use it in a bank machine. PIN numbers are only 4 digits, meaning 10,000 possible combinations. If you were patient, you could stand at the cash machine trying all possible 10,000 combinations. (Of course, ATM machines will always eat the cards after a few unsuccessful tries in order to stop this).

Key point: The term brute force often means "the most difficult way". In the above example of the PIN number, you can always find the PIN number after guessing 10,000 combinations. But sometimes there are easier ways. For example, a bank may choose to assign PIN numbers based upon a combination of the issuing date and the user's name. Therefore, the problem is reduced to guessing when a card was issued, which may consist of only a few hundred guesses.

Therefore, any technique that is more difficult than brute force is pointless. Likewise, brute force is very difficult, so hackers continually search for techniques that are less difficult.
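
Example: The PIN analogy can be sketched in Python as follows. The secret PIN and the check function are hypothetical stand-ins for the bank's verification, and the sketch assumes no lockout after failed tries (which, as noted above, real cash machines do enforce).

```python
from itertools import product

def brute_force_pin(check, digits: int = 4):
    """Try every PIN in numeric order; return (pin, attempts) or (None, attempts)."""
    attempts = 0
    for combo in product("0123456789", repeat=digits):
        guess = "".join(combo)
        attempts += 1
        if check(guess):
            return guess, attempts
    return None, attempts

# Hypothetical target used only for this illustration.
SECRET_PIN = "7291"
found, tries = brute_force_pin(lambda pin: pin == SECRET_PIN)
```

The search can never need more than 10,000 attempts, which is why any cleverer attack only matters if it beats that bound.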

Key point: The possibility of doing brute-force key-space searches is often compared to the age of the universe, number of atoms in the planet earth, and the yearly output of the sun. For example, Bruce Schneier has calculated that, according to what we know of quantum mechanics today, the entire energy output of the sun is insufficient to break a 197-bit key.

BS7799 (British Standard for Information Security Management, British Standard 7799)[3]
In Britain, BS 7799 is a set of "best practices" guidelines for information security, and more importantly, how organizations can demonstrate compliance to independent accredited auditors and receive certification. First published in early 1995, it was really the first set of guidelines by a standards body that could reasonably be implemented by commercial industry (from small to large businesses). It is thought of as the ISO 9000 of infosec, and will certainly have a strong influence on whatever the ISO eventually ratifies (currently assigned the tentative number ISO/IEC 17799-1). It had a strong influence on the HIPAA guidelines created in 1999 designed to protect privacy within the United States health care industry. BS 7799 was updated in 1999 to include controls for e-commerce, mobile computing, teleworking, and outsourcing.

Misconception: Certification doesn't mean the business cannot get hacked. Rather, it certifies that the business is aware of its security risks, has identified how it is going to manage those risks, and has communicated this information broadly within the organization. For example, a business could put out a website with the statement "we don't care if it gets hacked" and be within compliance. They just need to identify this fact and publish it within the organization.

See also: Common Criteria, CDSA

buffer overflow (buffer overrun)[2]
A classic exploit that sends more data than a programmer expects to receive. Buffer overflows are one of the most common programming errors, and the ones most likely to slip through quality assurance testing.

Example: Consider a web form with a single "Username:" field. This form limits input to 10 characters; the browser won't let you type more than that because the form was programmed with a maxlength=10 parameter. However, when this form is submitted, it will actually be sent as a URL that looks something like http://www.robertgraham.com/pubs/test.html?username=robert. Lazy programmers assume that browsers will never submit more than 10 characters, and write code that will break if the user submits more. As a hacker, you could simply go to the top of your screen and edit the URL to look something like http://www.robertgraham.com/pubs/test.html?username=robertxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx. This may crash the target system or allow you to bypass password checks.

Analogy: Consider two popular bathroom sink designs. One design is a simple sink with a single drain. The other design includes a backup drain near the top of the sink. The first design is easy and often looks better, but suffers from the problem that if the drain is plugged and the water is left running, the sink will overflow all over the bathroom. The second design prevents the sink from overflowing, as the water level can never get past the top drain.

Example: In much the same way, programmers often forget to validate input. They (rightly) believe that a legal username is less than 32 characters long, and (wrongly) reserve more than enough memory for it, typically 200 characters. They assume that nobody will enter a name longer than 200 characters, and don't verify this. Malicious hackers exploit this condition by purposely entering usernames 1000 characters long.
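
Example: The missing check is small. The Python sketch below is only an illustration of validating length on the receiving side, using the form URL shown earlier in this entry; the function name and the choice to raise an error are assumptions. Python itself bounds-checks its strings, so this shows the validation habit rather than an actual memory overflow.

```python
from urllib.parse import urlparse, parse_qs

MAX_USERNAME = 10   # mirrors the form's maxlength=10, but enforced by the receiver

def extract_username(url: str) -> str:
    """Return the username from the query string, rejecting over-long input."""
    values = parse_qs(urlparse(url).query).get("username", [""])
    username = values[0]
    if len(username) > MAX_USERNAME:
        # Never trust the browser to have enforced the limit.
        raise ValueError("username too long")
    return username
```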

Key point: This is a classic programming bug that afflicts almost all systems. The average system on the Internet is vulnerable to a well known buffer overflow attack. Many Windows NT servers have IIS services vulnerable to a buffer overflow in ".htr" handler, many Solaris servers have vulnerable RPC services like cmsd, ToolTalk, and statd; many Linux boxes have vulnerable IMAP4, POP3, or FTP services.

Key point: Programs written in C are most vulnerable, C++ is somewhat less vulnerable. Programs written in scripting level languages like VisualBasic and Java are generally not vulnerable. The reason is that C requires the programmer to check buffer lengths, but scripting languages generally make these checks whether the programmer wants them or not.

Key point: Buffer overflows are usually a Denial-of-Service in that they will crash/hang a service/system. The most interesting ones, however, can cause the system to execute code provided by the hacker as part of the exploit.

Defenses: There are a number of ways to avoid buffer-overflows in code:

  • Use programming languages like Java that bounds-check arrays for you.
  • Run code through special compilers that bounds-check for you.
  • Audit code manually
  • Audit code automatically

Key point: The NOOP (no operation) machine language instruction for x86 CPUs is 0x90. Buffer overflows often have long strings of these characters when attacking x86 computers (Windows, Linux).
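
Example: A rough sketch, in Python, of how one might look for such strings in captured data. The threshold of 32 bytes is an arbitrary assumption, and the function names are made up; this is a simple heuristic, not how any particular intrusion detection product works.

```python
def longest_nop_run(data: bytes) -> int:
    """Length of the longest run of consecutive 0x90 bytes in data."""
    longest = current = 0
    for byte in data:
        current = current + 1 if byte == 0x90 else 0
        longest = max(longest, current)
    return longest

def looks_like_nop_sled(data: bytes, threshold: int = 32) -> bool:
    """Heuristic only: flags data containing a long run of 0x90 bytes."""
    return longest_nop_run(data) >= threshold
```

Legitimate binary data can also contain runs of 0x90, so a match is a hint worth investigating rather than proof of an attack.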

Key point: In a successful buffer overflow exploit, the hacker forces the system to run his own code. Since most network services run as "root" or "administrator", the exploit would give complete control over the machine. For this reason, more and more services are being configured to run with lower privileges.


- C -

[ C | CA | cable-modem | cache | CALEA | camping | cancel-bot | Carnivore | carrier-scanning | CDSA | central office | certificate | Certificate Authority | CGI | cgi-bin | chaining | challenge | chat | checksum | chosen plaintext | cipher | ciphertext | circuit switched network | clear-text | cmos | CO | Code | codebook | colo | command-line | Common Criteria | community strings | compiler | complexity | compression | con | confidentiality | cookie | covert channel | crack | cracker | crackz | CRC | credentials | cron | cryptanalysis | crypto | cryptographic | cryptography | CSN | culture | cyberpunk ]

C programming language [3]

Point: The language is quirky, difficult for beginners to learn, and really just an accident of history. Despite this, one must grok the language in order to become an elite hacker.

Key point: The large number of buffer overflow exploits is directly related to the poor way that C protects programmers from doing the wrong thing. On the other hand, this lack of protection leads directly to its high speed.

cable-modem [1]
A local technology for connecting users to the Internet, the cable-modem is based upon the same wire that brings cable television to the home. A cable-modem tunes into reserved "channels" in order to receive Internet content. Usually, a block of several TV channels are reserved for downstream connection, and one or two channels are reserved for the upstream.

Key point: If you built your own hardware, you could likely build a sniffer to spy on your neighbor's Internet traffic. Some cable-modem segments can even be sniffed without special hardware by anybody who reconfigures their machine. Some cable-modem segments allow you to redirect a neighbor's traffic through your machine, which you can then sniff.

Key point: Your neighbors are open to lots of hacking techniques that are not generally possible from across the Internet. First, your machine will receive broadcasts from your neighbors. These broadcasts basically advertise your neighbor's presence, telling you how to hack into them. For example, neighbors who share their hard-drives will advertise themselves in the Windows Network Neighborhood. UNIX machines will also advertise a lot of information, such as through the 'rwho' mechanism. There are also lots of non-Internet protocols that appear on the local wire that can be used to break into your neighbors.

cache [3]
In general computer science, the word cache means simply to keep things around in case they are used again. For example, when you log onto your system, your username and password are stored in a cache in memory, because they are repeatedly used by the system every time you access a resource.

Key point: Sometimes systems can be exploited through the cache. Examples are:

HTTP proxy servers
Companies use these so that thousands of users can share a single Internet connection. They store recently used webpages so that when multiple users access the same web-site, the proxy server only has to go across the link once in order to fetch the page for all the users. A never ending series of bugs leads to conditions whereby when one user logs into a website, other users can see that first user's data.
Web-browser history/file cache
Once a hacker breaks into a machine, he/she can view the history cache (list of URLs) or file cache (the actual contents of the web-sites) in order to spy on where the user has been. Embarrassing, inadvertent disclosure of this information by users with certain surfing habits is common.
Web-browser cookie cache
Lots of web-sites store passwords within cookies, so that stealing somebody's cookie information will allow a hacker to log in as that user.

CALEA (Communications Assistance for Law Enforcement Act, digital telephony law)[2]
CALEA was passed by Congress in 1994 to preserve the "status quo": allowing law enforcement to tap digital lines with the same ease with which they tap analog lines. It requires phone companies (common carriers) to make sure their systems will support wiretapping. This required existing systems to be retrofitted (estimated cost: $500 million) as well as requiring that all new technological developments support wiretapping.

See also: key recovery, Carnivore, ECPA

camping [2]
A hacking technique whereby the intruder monitors a range of ISP dialup lines. As soon as a user dials-up, the hacker is notified and automated attack scripts are run. For example, it may ping the range continuously, and as soon as a ping responds, a script is run that attempts to connect to File and Print Sharing and read files from the hard-disk.

Key point: When dialing up to an ISP, the first 10 minutes are the most dangerous.

Carnivore [1]
A type of sniffer written by the FBI that scans Internet traffic looking for e-mail messages. It matches the "From:" and "To:" field of e-mail messages for names of suspects. If these fields match their criteria, the e-mail messages will be stored to the disk.

Misconception: The FBI does not install this on networks. They have to provide a search warrant to an ISP for the e-mail. Carnivore is one of the ways the ISP can fulfill the demands of the search warrant.

See also: CALEA

CDSA (Common Data Security Architecture)[3]
A standard from the Open Group designed to secure communications.

Resources: http://www.opengroup.org/pubs/catalog/c902.htm

central office (CO)[3]
In the telecom infrastructure, the central office (CO) is where all the local telephone lines come together. For example, in your home you have a pair of copper wires that lead all the way back to some building within a few miles of your home. This building has huge bunches of copper cables leading into it that are connected to the telephone company's equipment. From there, your voice is digitized and sent through the rest of the phone network.

certificate [3]
In PKI, a certificate contains the public key of the owner, and is signed by a trusted CA.

Key point: Certificates can be revoked. This means that a company who believes that their site has been compromised can put up a server on the Internet that tells everyone else that the certificate is no longer valid.

Key point: The Verisign certificates embedded in older browsers (IE 3.0, Netscape 4.0) have expiration dates of January 1, 2000. This means that anybody using older browsers will get nasty warnings when they visit e-commerce sites or attempt to verify files with Authenticode.

Certificate Authority (CA) [3]
A trusted authority who signs certificates.

Key point: The way it is supposed to work is that if you have a certificate that claims to be Microsoft, signed by Verisign (a popular CA), then you trust that Verisign has done a reasonable job both ensuring that Microsoft is who they say they are, and that Microsoft has done a reasonably good job protecting their private keys from theft.

Contrast: Microsoft could create a "self-signed" certificate, but then anybody else could create a self-signed certificate claiming to be Microsoft. Therefore, you trust a CA-signed certificate more than a self-signed certificate, as long as you trust the CA.

Key point: How do you trust a CA? The answer is marketing. First, a company like Verisign has spent millions of dollars creating a reputable company that would be destroyed if a flaw was found in their process (i.e. thieves were able to steal their private keys). Second, Verisign (and a few other CAs) have managed to embed their public keys within Internet Explorer and Netscape Navigator. This means that any website using SSL must obtain a certificate signed by one of these built-in CAs, or else users get confusing warning messages.

Humor: Microsoft uses certificates signed by Verisign, because it is trusted by many people. The reason so many people trust Verisign these days is because its root keys are included with Microsoft's browsers.

Key point: One of the chief RISKS is the theft of the private key used to sign things. If a hacker/thief is able to steal it, then they can masquerade as someone

Key point: Several important CA certificates (e.g. Verisign) expired on Dec. 31, 1999. Since it is feasible to eventually compromise a certificate, they usually expire at some date. The certificates for trusting root CAs that are built into many browsers (Internet Explorer 4.0 and earlier, Netscape Navigator 4.06 and earlier) were created in 1995, and were made for a 5-year lifespan. One of the creators of these certificates now says he wished he'd put the expiration date a little off, such as on Dec. 15, in order to avoid the Y2K madness.

cgi-bin (CGI, Common Gateway Interface)[3]
On web-servers, CGI is a standard for creating dynamic content. When you request a document in the /cgi-bin directory, instead of sending you the document, the web-server passes your request to the named program/script. This program generates the requested document on the fly, usually based upon the contents of a backend database. The word "CGI" stands for "Common Gateway Interface", which generally confuses people more than it helps them.

chaining [4]
For block-ciphers, chaining is the technique of combining the information from previous blocks into the encryption of the next block such that the same pattern in a message will not be encrypted the same way twice.

challenge [3]
A method to authenticate users that avoids sending passwords over the network. It goes something like this (though the details among various programs are different).
  • the client requests access
  • the server sends back random data
  • the client then encrypts/hashes the data using the password
  • the server checks the result
In this manner, the client proves it knows the correct password without ever sending it across the wire.
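
Example: The four steps above can be sketched in Python as follows. This assumes HMAC-SHA256 as the "encrypt/hash" step and a 16-byte challenge; as noted, real programs differ in the details, and the function names here are made up for illustration.

```python
import hashlib
import hmac
import secrets

def make_challenge() -> bytes:
    """Server side: fresh random data for each login attempt."""
    return secrets.token_bytes(16)

def respond(password: str, challenge: bytes) -> bytes:
    """Client side: hash the challenge using the password as the key."""
    return hmac.new(password.encode(), challenge, hashlib.sha256).digest()

def verify(password: str, challenge: bytes, response: bytes) -> bool:
    """Server side: recompute the expected answer and compare in constant time."""
    expected = respond(password, challenge)
    return hmac.compare_digest(expected, response)
```

Only the challenge and the response cross the wire; the password itself stays on each end.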

Key point: In most cases the user is prompted for the password, which the client then stores in memory. In the use of smart cards, however, the system may give the user the challenge string, which the user then types into the smart card. The smart card then produces a response, which the user must type back into the system. In this way, the user validates that they have the smart card.

Key point: Challenge-response systems are thought to be more secure because the challenge/response is different every time. This guards against replay attacks as well as making cracking more difficult.

chat [2]

Key point: Favorite because it provides real-time anonymous communication.

checksum [1]
A technique for detecting if data inadvertently changes during transmission. The sender simply divides all the data up into two-character numbers, then adds all the numbers together. The receiver makes the same calculation, and checks the calculated checksum with the transmitted checksum. If they don't match, then the receiver knows the data was corrupted in transit.

Key point: Checksums are not secure against intentional changes by hackers. For that, you need a cryptographic hash.
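
Example: A small Python sketch of why a simple additive checksum catches accidents but not tampering. It assumes 16-bit big-endian words and zero padding for odd lengths, which are illustrative choices rather than any particular protocol's rules; the sample messages are made up.

```python
import hashlib

def checksum16(data: bytes) -> int:
    """Add the data up as 16-bit words, keeping only the low 16 bits."""
    if len(data) % 2:
        data += b"\x00"   # pad odd-length data (an assumption of this sketch)
    total = 0
    for i in range(0, len(data), 2):
        total += int.from_bytes(data[i:i + 2], "big")
    return total & 0xFFFF

original = b"PAY:1090"
tampered = b"PAY:9010"   # two 16-bit words swapped on purpose

same_checksum = checksum16(original) == checksum16(tampered)
same_hash = hashlib.sha256(original).digest() == hashlib.sha256(tampered).digest()
```

Because addition does not care about order, swapping words leaves the checksum unchanged, while the cryptographic hash of the two messages differs.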

cipher (decipher)[4]
In cryptography, the word cipher refers to an encryption algorithm. A cipher transforms the original data/message into pseudo-random data/message of the same length. In order to decipher the message, a reverse transformation must be applied.

Key point: A block cipher is one that encrypts a block of data at a time. For example, DES uses a block size of 64 bits. Each input block must correspond to exactly one output block (like a codebook). A block-cipher suffers from the fact that the same data repeated in a message will be encoded in the same way. Consider a cipher with an 8-bit block size encrypting English text; you could figure out all the letter 'e's in the ciphertext because 'e' is the most common letter used. Therefore, block-ciphers are often used in a chaining mode such that the same pattern will be encrypted differently each time it appears.

Key point: A stream cipher is essentially a chained block cipher with a block size of 1 (either 1-bit or 1-byte). It generates a keystream against which it XORs the plaintext, operating much like a one-time pad. It is less secure than a one-time pad in theory, but more secure in practice.
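
Example: A toy illustration of XORing plaintext against a keystream. The keystream generator here (SHA-256 over the key plus a counter) is an assumption chosen only to keep the sketch self-contained; it is not a vetted cipher and it omits the nonce a real design would need.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: hash the key together with an increasing counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_stream(key: bytes, data: bytes) -> bytes:
    """XOR each data byte with the matching keystream byte.

    The same call both encrypts and decrypts, since XOR undoes itself.
    """
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))


ciphertext = xor_stream(b"secret key", b"attack at dawn")
print(xor_stream(b"secret key", ciphertext))   # recovers the plaintext
```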

ciphertext [4]
In cryptography, ciphertext refers to the data after it has been encrypted.

Contrast: clear-text, plaintext.

clear-text [4]
In cryptography, the term clear-text refers to messages that have not been encrypted. The word has the connotation of data that should be encrypted, but isn't (such as clear-text passwords).

Misunderstanding: The word text comes from traditional cryptography, where it meant the text of messages. These days it can refer to binary computer data as well.

cmos [3]
When the system is powered off, some persistent BIOS settings are stored in a small bit of battery-sustained RAM built using CMOS technology. The name "CMOS settings" has become synonymous with "BIOS settings". Some viruses have been known to corrupt these settings, resulting in a condition where the machine can no longer boot. Simply setting a jumper to disconnect the battery backup will restore the settings to factory defaults.

codebook [4]
In ancient times, a codebook was a book where you looked up a word and replaced it with another word according to the substitution table in the book. For example, you might look up the words "attack at dawn" in the book and come up with the words "mouse dog cat" that you send to your troops. The troops receiving the message would likewise look up these words in their codebooks in order to figure out the original message.

Key point: In block-ciphers, the key represents a codebook. In other words, you could use the key to generate a huge book of matching pairs whereby each plaintext block would match to exactly one ciphertext block. Then, you could encrypt messages by looking them up in this table.

Key point: The term ECB or Electronic Code-Book refers to this mode of using a block-cipher. However, since it leaks information, many people prefer to chain blocks of ciphertext and plaintext together in order to make sure that the same pattern will be encrypted differently when it appears multiple times in a message.
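
Example: A toy sketch of the difference. The "cipher" here is literally a codebook: a key-seeded shuffle of the 256 possible 8-bit blocks, which is far too small to be secure and is used only for illustration. The chained variant XORs each plaintext block with the previous ciphertext block before the lookup (the approach usually called CBC); the function names and the starting value iv are assumptions of this sketch.

```python
import random


def make_codebook(key: int) -> list[int]:
    """Derive a table mapping each 8-bit plaintext block to one ciphertext block."""
    table = list(range(256))
    random.Random(key).shuffle(table)
    return table


def ecb_encrypt(table: list[int], plaintext: bytes) -> bytes:
    """Codebook mode: every block is looked up independently."""
    return bytes(table[b] for b in plaintext)


def cbc_encrypt(table: list[int], plaintext: bytes, iv: int) -> bytes:
    """Chained mode: mix in the previous ciphertext block before the lookup."""
    out = bytearray()
    prev = iv
    for b in plaintext:
        prev = table[b ^ prev]
        out.append(prev)
    return bytes(out)


def cbc_decrypt(table: list[int], ciphertext: bytes, iv: int) -> bytes:
    """Reverse the lookup, then remove the previous ciphertext block."""
    inverse = [0] * 256
    for plain, cipher in enumerate(table):
        inverse[cipher] = plain
    out = bytearray()
    prev = iv
    for c in ciphertext:
        out.append(inverse[c] ^ prev)
        prev = c
    return bytes(out)


table = make_codebook(1234)
print(ecb_encrypt(table, b"eeee").hex())        # the same byte, four times over
print(cbc_encrypt(table, b"eeee", 0x5A).hex())  # chained output for the same input
```

In codebook mode the repeated letter is plainly visible in the output, which is exactly the leak that frequency analysis exploits.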

colo (colloc, collocation facility)[1]
A collocation facility is one where many different people host their websites on machines located in the same facility. Most of the major websites on the Internet are hosted at collocation facilities rather than at the company's own headquarters. The main service that collocation facilities provide is "uptime". They provide redundant power supplies with backup generators, as well as redundant links to the Internet. Systems within colo facilities are usually rack mounted and secured by private cages.

Example: Continuous power is one of the major features that a collocation facility might provide. The theory is that a website doesn't need a UPS because the collocation facility will be more reliable than the UPS itself. For example, I host a site at a colo that connects to two separate city grids for power, with their own battery backup system, as well as their own generator. They occasionally unplug themselves from the city grids in order to test their system. Likewise, they bring multiple power feeds to racks so that a system with multiple power supplies can get its power from independent grids.

Key point: Major colos have visually impressive security. However, they really aren't at the same paranoid level as the military, CIA, NSA, or banks (and probably won't be until a major physical security breach occurs). Their network security is extremely weak, often forcing customers to share common broadcast domains, which would allow one customer to subvert another's traffic.

command-line (command-prompt, DOS prompt, shell, CLI, command-line interface)[1]
One of the two fundamental user interfaces. Whereas most people are familiar with "graphical user interfaces (GUIs)" using windows and mice, the command-line provides a raw interface into the inner workings of the computer.

Key point: The average hacker does all his/her work from the command-line. Virtually all hacker tools are command-line oriented.

Common Criteria (CC, ISO 15408)[3]
Full name: Common Criteria for Information Technology Security Evaluation.

CC is a set of government-oriented standards designed to create commonly agreed-upon criteria with which to describe and judge infosec. For example, if you want to purchase a "secure" computer from a vendor, the CC gives you a common set of criteria with which to evaluate that system. If you want to talk about infosec issues, the CC gives you a language in which to describe them. These common criteria were put together by government departments from Canada, France, Germany, the Netherlands, Great Britain, and the United States (both NIST and the NSA).

Controversy: The CC uses terminology that is far from the infosec mainstream. Furthermore, many believe that while products that match the criteria would be secure, they would also be worthless (in much the same way that a computer turned off, unplugged, and locked in the basement is secure from remote attacks).

Key point: The CC breaks down security functionality into the following areas:

auditing [FAU]
  • recognizing which events should be audited
  • how such events should be recorded
  • how the events should be stored, such as in a protected database
  • analyzing and correlating the events
For example, every user-logon event should be recorded. The system recording it might pass it on to a "write-only" database without the ability to later erase it (the first thing a hacker might do when he/she breaks into a system). Later events might be correlated to the logon event.
crypto [FCS]
  • encrypting communication
  • identifying/authenticating users
  • key management
communications [FCO]
non-repudiation
user data protection [FDP]
Protecting data during import/input, storage, and export/output.
identification and authentication [FIA]
Identification of who the user is and what rights/abilities/authority/attributes that user has to access systems and data.
management of security [FMT]
Different management roles, where different managers have separate capabilities
privacy [FPR]
protection of the security functions [FPT]
Securing the security management data itself (as opposed to the user data).
resource utilization [FRU]
Resources are things like CPU time, disk storage, network bandwidth. Concerns are fault tolerance, prioritization, and resource allocation among different users.
access control [FTA]
identification/authentication, number of logons, access history, access parameters like time-of-day they may log in.
trusted path/channels [FTP]
Between users and the system as well as between systems.

Resources: http://csrc.ncsl.nist.gov/cc/

compiler [1]
In programming, a compiler takes human readable source code and converts it into the binary code that the computer can understand.

Key point: A compiler is a form of lossy compression and one-way encryption. All the information meaningful to humans is removed from the code, leaving only the information necessary for the computer. This means that humans can no longer easily read the resulting program directly. Because of the "one-way" nature of the operation, programs cannot be used to recover the original source code. This effect differs among languages. C++ is the hardest language to decompile; Java is the easiest. Most Java applets can be decompiled back to some semblance of their previous form. This has led to a market for programs that further obfuscate Java binaries in an effort to hide the original source code. Some compilers do leave human-readable symbols behind for debugging purposes. They won't reveal the original source, but can still be useful for reverse engineering. They can be "stripped" from the binary.

Computer Fraud and Abuse Act of 1986 [3]
A law passed to clarify computer crimes and computer fraud. It established two new felony offenses: breaking into federal computers and trafficking in computer passwords.

complexity [3]
In computer science, complexity measures how difficult a problem is to solve. The concern is that even when we know of an algorithm that solves a problem, it may take a computer too long to run it.

The best way to understand complexity is to consider the ancient parable (Babylonian?) about a king and a wise subject who did a favor for him. The subject asked for one grain to be placed on the first square of a chess board, two grains on the second, four grains on the third, and so on, doubling the amount of grain for each successive square.

1 2 4 8 16 --- --- ---
--- --- --- --- --- --- --- ---
--- --- --- --- --- --- --- ---
--- --- --- --- --- --- --- ---
--- --- --- --- --- --- --- ---
--- --- --- --- --- --- --- ---
--- --- --- --- --- --- --- ---
--- --- --- --- --- --- --- ---

The question is: how much grain does this come out to? Your possible choices are:

  • a few handfuls
  • a few buckets
  • several wagons full of grain
  • all the grain produced by the kingdom in a year
  • more than the combined total ever harvested by mankind

The problem is known as having exponential complexity. The average computer scientist, when confronted with this problem, would intuitively guess the correct answer: the total is 2^64 - 1 grains, roughly 18 billion billion, or more than all the grain ever harvested by mankind.

1 2 4 8 16 32 64 128
256 512 1024 2048 4096 8192 16384 32768
65536 131072 262144 524288 1048576 2097152 4194304 8388608
16777216 33554432 67108864 134217728 268435456 536870912 1073741824 2147483648
4294967296 8589934592 17179869184 34359738368 68719476736 137438953472 274877906944 549755813888
1099511627776 2199023255552 4398046511104 8796093022208 17592186044416 35184372088832 70368744177664 140737488355328
281474976710656 562949953421312 1125899906842624 2251799813685248 4503599627370496 9007199254740992 18014398509481984 36028797018963968
72057594037927936 144115188075855872 288230376151711744 576460752303423488 1152921504606846976 2305843009213693952 4611686018427387904 9223372036854775808
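
Example: The table above, and the total it adds up to, can be checked with a few lines of Python (a sketch; the variable names are arbitrary).

```python
# One entry per chessboard square: 1, 2, 4, ... doubling each time.
squares = [2 ** i for i in range(64)]

# The total is one less than the next doubling would have been.
total = sum(squares)

print(squares[-1])   # 9223372036854775808, the last square alone
print(total)         # 18446744073709551615, i.e. 2**64 - 1
```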

Example: Let's say that a dictionary was not sorted. This means that you would have to start at the beginning and look at every word until you found the definition you were looking for. This is an algorithm with linear complexity. The time it takes you to look up a word in such a dictionary is related to the number of words in the dictionary: if you double the size of such a dictionary, you will double the amount of time it takes to look up a word. In other words, the time to look up a word in this dictionary is on the order of the size of the dictionary. This is expressed as O(n), where n is the size of the dictionary.

Example: Dictionaries are sorted before printing. This means that you can quickly find the word you are looking for. In terms of complexity, we are more interested in how much longer it will take you to look up a word if we double the size of the dictionary. For example, say the Oxford English Dictionary (OED) is about 8 times larger than an abridged English dictionary. It then only takes about 3 times longer to look up a word in the OED. As the problem size grows, the amount of effort it takes to solve the problem grows more slowly. If the OED were 16 times larger than the abridged dictionary, it would take only 4 times longer to search. If it were 32 times larger, it would take only 5 times longer to search. This mathematical relationship is known as a logarithm. The increase in computing power needed to solve such a problem grows on the order of the logarithm of the size of the problem. This is expressed as O(log n). Logarithmic problems are much easier to solve than linear ones, which is why we sort dictionaries.
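
Example: The two dictionary lookups can be compared by counting how many words each one has to examine. This is a sketch with made-up function names; the "dictionary" is just a sorted list of strings.

```python
def linear_steps(words: list[str], target: str) -> int:
    """Unsorted-style lookup: examine words one by one from the start."""
    for steps, word in enumerate(words, start=1):
        if word == target:
            return steps
    return len(words)


def binary_steps(words: list[str], target: str) -> int:
    """Sorted lookup: halve the remaining range after every comparison."""
    lo, hi = 0, len(words) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if words[mid] == target:
            break
        if words[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps


for size in (1024, 2048):
    words = [f"{i:05d}" for i in range(size)]   # already in sorted order
    last = words[-1]                            # worst case for the linear scan
    print(size, linear_steps(words, last), binary_steps(words, last))
```

Doubling the list doubles the work for the linear scan, but adds only about one extra comparison to the binary search.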

Example: The chessboard problem mentioned above is similar to encryption keys. Every additional square on the chessboard doubles the size of the problem; every additional bit added to a key doubles the amount of time it would take to crack it. This means that a 32-bit key would take roughly four billion trials in order to crack, a 64-bit key would be roughly four billion times harder than that to crack, and a 128-bit key is about 18 billion billion times harder to crack than a 64-bit key. This complexity is expressed as O(2^n).

Key point: The following table shows the complexity of some algorithms, assuming each step takes one second.

big-O      complexity    problem = 8 elements   problem = 32 elements
O(log n)   logarithmic   3 seconds              5 seconds
O(n)       linear        8 seconds              32 seconds
O(n^2)     quadratic     1 minute               17 minutes
O(n^3)     cubic         9 minutes              9 hours
O(2^n)     exponential   4 minutes              136 years

Note the deceptive nature of these numbers: for a problem size of 8, our exponential algorithm is actually faster than the cubic algorithm. But if you were to choose it in order to solve a problem of size 32, it would not complete in your lifetime.
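
Example: The figures in such a table can be recomputed with a short script. This sketch assumes one step per second and a base-2 logarithm; the names running_times and SECONDS_PER_YEAR are invented for the example.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def running_times(n: int) -> dict[str, float]:
    """Seconds needed for a problem of size n at one step per second."""
    return {
        "O(log n)": math.log2(n),
        "O(n)": n,
        "O(n^2)": n ** 2,
        "O(n^3)": n ** 3,
        "O(2^n)": 2 ** n,
    }


for n in (8, 32):
    print(f"problem = {n} elements")
    for name, seconds in running_times(n).items():
        print(f"  {name:9} {seconds:>12.0f} seconds")

# The exponential case at n = 32, converted to years.
print(running_times(32)["O(2^n)"] / SECONDS_PER_YEAR)
```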

compression [1]
Since encrypted data is essentially random, you cannot compress it. This defeats networking standards designed to automatically compress traffic (such as dial-up modems). Therefore, data must be compressed before it is encrypted. For this reason, compression is becoming an automatic feature of most encryption products. The most often used compression standard is gzip and its compression library zlib.
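
Example: A quick way to see the effect with zlib. In this sketch, pseudo-random bytes stand in for encrypted data (an assumption made to avoid depending on any particular cipher), and the sample text is arbitrary.

```python
import random
import zlib

# Ordinary text is full of repetition, so it compresses well.
text = b"the quick brown fox jumps over the lazy dog. " * 100

# Stand-in for ciphertext: bytes with no pattern for the compressor to find.
rng = random.Random(0)
random_like = bytes(rng.getrandbits(8) for _ in range(len(text)))


def ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data)) / len(data)


print(f"plain text:  {ratio(text):.3f}")
print(f"random-like: {ratio(random_like):.3f}")
```

The text shrinks to a small fraction of its size, while the random-looking data does not shrink at all, which is why the compression step has to come first.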

con [2]
Slang term for convention. Popular conventions are:
DEFCON
Held in the summer in Las Vegas.
HOPE
"Hackers On Planet Earth" put on by 2600 magazine.

confidentiality [3]
One of the six major areas of infosec, confidentiality is the area concerned with keeping secrets.

Contrast: For the most part, the words confidentiality and privacy are interchangeable. We typically apply the word privacy to individuals, and include ideas like anonymity and unobservability. We use words like confidentiality to refer to governments and corporations who wish to defend against eavesdropping.

Key point: We use encryption to protect secrets from being eavesdropped.

cookie [1]
Cookies are small bits of data that a website can place on your system, requesting your browser to send them back to the website the next time you visit. Cookies are a way of personalizing websites, and in general making the whole web experience better.
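
Example: The round trip can be sketched with Python's standard http.cookies module. The helper names and the cookie names ("visitor", "theme") are hypothetical; the sketch only shows the header a site sends to place a cookie and how it reads the value the browser sends back.

```python
from http.cookies import SimpleCookie
from typing import Optional


def server_sets_cookie(name: str, value: str) -> str:
    """Build the response header that asks the browser to store a cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    return cookie.output()


def server_reads_cookie(header_value: str, name: str) -> Optional[str]:
    """Parse the Cookie header the browser sends back on a later visit."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    morsel = cookie.get(name)
    return morsel.value if morsel else None


print(server_sets_cookie("visitor", "abc123"))
print(server_reads_cookie("visitor=abc123; theme=dark", "theme"))
```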

Misconception: Cookies by themselves are not a security/privacy risk. However, when combined with the HTTP Referer field and cross-site embedded images, they can be used to track users' activities. Users have sued sites like DoubleClick, which serve massive numbers of cross-site embedded images, over the private information they collect. Cookies receive most of the blame for this.

Example: The biggest privacy hole is when cookies are combined with the HTTP Referer field. If many sites embed images (like advertisements) from a single site, that single site can use cookies in order to track a user going among those sites. The cookie does not identify who the user is, but can track what the user does. Other information, like web-site logons, can then be combined with this information in order to track who the person is.

Example: JavaScript has a long history of problems with cookies such that one website can retrieve the cookie information for another website. Since cookie information often contains username/password information, this can compromise the site.

Key point: Turning off cookies is not practical. The best you can hope for is "cookie management" -- choose which sites you want to allow cookies for but deny them to all the rest.

covert channel [4]

Key point: One rootkit uses ICMP as a covert channel. It creates a virtual TCP-like circuit inside of ping packets.

Key point: Covert channels can become extremely covert. In theory, one can create a covert channel where only the IP identification field (16-bits) carries the data.
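
Example: The arithmetic behind that idea is simple: each packet carries one 16-bit value, so a message is cut into two-byte pieces. The sketch below only converts between bytes and a list of 16-bit numbers; it sends nothing, the function names are invented, and the zero-byte padding is an assumption.

```python
def encode_ip_ids(message: bytes) -> list[int]:
    """Split a message into 16-bit values, one per would-be packet."""
    if len(message) % 2:
        message += b"\x00"       # pad to a whole number of 16-bit fields
    return [int.from_bytes(message[i:i + 2], "big")
            for i in range(0, len(message), 2)]


def decode_ip_ids(ids: list[int]) -> bytes:
    """Reassemble the message from the observed 16-bit values."""
    return b"".join(value.to_bytes(2, "big") for value in ids)


ids = encode_ip_ids(b"secret")
print(len(ids))                  # three packets carry six bytes
print(decode_ip_ids(ids))
```

At two bytes per packet the channel is slow, which is part of what makes it hard to notice.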

Key point: URLs and DNS queries pass through virtually everything (including proxies). Therefore, it is easy to export information from inside a company to the outside using this technique.

crack [2]
To decrypt a password, or to bypass a copy protection scheme. See crackz for more about copy protection.

History: When the UNIX operating system was first developed, passwords were stored in the file /etc/passwd. This file was made readable by everyone, but the passwords were encrypted so that a user could not figure out what another person's password was. The passwords were encrypted in such a manner that you could test a password to see if it was valid, but you really couldn't decrypt the entry. (Note: not even administrators are able to figure out users' passwords; they can change them, but not decrypt them). However, a program called "crack" was developed that would simply test all the words in the dictionary against the passwords in /etc/passwd. This would find all user accounts whose passwords were chosen from the dictionary. Typical dictionaries also included people's names since a common practice is to choose a spouse's or child's name.

Contrast: A "crack" program is one that takes existing encrypted passwords and attempts to find some that are "weak" and easily discovered. It is not a "password guessing" program that tries to log in with many passwords; that is known as a grind.

Key point: The sources of encrypted passwords typically include the following:

  • /etc/passwd from a UNIX system
  • SAM or SAM._ from a Windows NT system
  • <username>.pwl from a Windows 95/98 system
  • sniffed challenge hashes from the network

Key point: The "crack" program is a useful tool for system administrators. By running the program on their own systems, they can quickly find users who have chosen weak passwords. In other words, it is a policy enforcement tool.
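
Example: A sketch of that kind of audit. It is not the actual "crack" program: salted SHA-256 stands in for the traditional UNIX password hash (an assumption made to keep the sketch self-contained), and the account names, salts, and word list are made up.

```python
import hashlib


def hash_password(password: str, salt: str) -> str:
    """One-way hash: easy to test a guess against, hard to reverse."""
    return hashlib.sha256((salt + password).encode()).hexdigest()


def audit(entries: dict[str, tuple[str, str]], wordlist: list[str]) -> dict[str, str]:
    """Return the accounts whose stored hash matches a dictionary word."""
    weak = {}
    for user, (salt, stored_hash) in entries.items():
        for word in wordlist:
            if hash_password(word, salt) == stored_hash:
                weak[user] = word
                break
    return weak


# Hypothetical password file: user -> (salt, stored hash).
entries = {
    "alice": ("ab", hash_password("sunshine", "ab")),
    "bob": ("cd", hash_password("x9$Tq!7z", "cd")),
}
print(audit(entries, ["password", "sunshine", "dragon"]))
```

Only the account with a dictionary word as its password shows up, which is the list an administrator would use to ask those users to pick something stronger.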

Tools: On UNIX, the most commonly used program is called simply "crack". On Windows, a popular program is called "l0phtCrack" from http://www.l0pht.com/.

cracker [1]
A specific type of hacker who decrypts passwords or breaks software copy protection schemes (creating "crackz").

Controversy: See the word hacker for a disagreement about the way that "cracker" is used in the computer enthusiast community vs. the security community.

CRC (Cyclic Redundancy Check)[2]
A form of a checksum that is able to detect accidental transmission errors. It is used on Ethernet in order to detect packet errors. It is also used on some operating systems in order to detect accidental errors in programs before running them.

Key point: Like a checksum, a CRC is not able to detect intentional changes.
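
Example: A short sketch using the CRC-32 function in Python's zlib module. The frame contents and the helper name is_intact are made up. It shows an accidental bit flip being caught, and why a deliberate change goes unnoticed: whoever alters the data can simply recompute the CRC to match.

```python
import zlib


def is_intact(data: bytes, crc: int) -> bool:
    """Receiver side: recompute the CRC and compare with the one received."""
    return zlib.crc32(data) == crc


frame = b"hello, ethernet"
crc = zlib.crc32(frame)

# Accidental error: a single bit flipped in transit.
damaged = bytes([frame[0] ^ 0x01]) + frame[1:]
print(is_intact(damaged, crc))                 # the mismatch is detected

# Intentional change: the attacker replaces the data AND the CRC.
forged = b"HELLO, ethernet"
print(is_intact(forged, zlib.crc32(forged)))   # passes the check
```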

crackz [2]
Patches for programs that bypass copy protection schemes.

Culture: Cracking programs is its own little underground 'scene' independent of other hacking activities. Groups and individuals often compete to be the first to break a new copy protection scheme in popular programs. There are many sites that catalogue cracked programs.

credentials [4]
Your authentication information, such as a password, token, or certificate. Sin