Tuesday, May 29, 2012

Binary Tree


Binary Trees

Introduction

We extend the concept of linked data structures to structure containing nodes with more than one self-referenced field. A binary tree is made of nodes, where each node contains a "left" reference, a "right" reference, and a data element. The topmost node in the tree is called the root.Every node (excluding a root) in a tree is connected by a directed edge from exactly one other node. This node is called a parent. On the other hand, each node can be connected to arbitrary number of nodes, called children. Nodes with no children are called leaves, or external nodes. Nodes which are not leaves are called internal nodes. Nodes with the same parent are called siblings.  
More tree terminology:
  • The depth of a node is the number of edges from the root to the node.
  • The height of a node is the number of edges from the node to the deepest leaf.
  • The height of a tree is a height of the root.
  • A full binary tree.is a binary tree in which each node has exactly zero or two children.
  • A complete binary tree is a binary tree, which is completely filled, with the possible exception of the bottom level, which is filled from left to right.
A complete binary tree is very special tree, it provides the best possible ratio between the number of nodes and the height. The height h of a complete binary tree with N nodes is at most O(log N). We can easily prove this by counting nodes on each level, starting with the root, assuming that each level has the maximum number of nodes:
n = 1 + 2 + 4 + ... + 2h-1 + 2h = 2h+1 - 1
Solving this with respect to h, we obtain
h = O(log n)
where the big-O notation hides some superfluous details.

Advantages of trees

Trees are so useful and frequently used, because they have some very serious advantages:
  • Trees reflect structural relationships in the data
  • Trees are used to represent hierarchies
  • Trees provide an efficient insertion and searching
  • Trees are very flexible data, allowing to move subtrees around with minumum effort

Traversals

A traversal is a process that visits all the nodes in the tree. Since a tree is a nonlinear data structure, there is no unique traversal. We will consider several traversal algorithms with we group in the following two kinds
  • depth-first traversal
  • breadth-first traversal
There are three different types of depth-first traversals, :
  • PreOrder traversal - visit the parent first and then left and right children;
  • InOrder traversal - visit the left child, then the parent and the right child;
  • PostOrder traversal - visit left child, then the right child and then the parent;
There is only one kind of breadth-first traversal--the level order traversal. This traversal visits nodes by levels from top to bottom and from left to right.
As an example consider the following tree and its four traversals:

PreOrder - 8, 5, 9, 7, 1, 12, 2, 4, 11, 3
InOrder - 9, 5, 1, 7, 2, 12, 8, 4, 3, 11
PostOrder - 9, 1, 2, 12, 7, 5, 3, 11, 4, 8
LevelOrder - 8, 5, 4, 9, 7, 11, 1, 12, 3, 2
    

In the next picture we demonstarte the order of node visitation. Number 1 denote the first node in a particular traversal and 7 denote the last node.
These common traversals can be represented as a single algorithm by assuming that we visit each node three times. An Euler tour is a walk around the binary tree where each edge is treated as a wall, which you cannot cross. In this walk each node will be visited either on the left, or under the below, or on the right. The Euler tour in which we visit nodes on the left produces a preorder traversal. When we visit nodes from the below, we get an inorder traversal. And when we visit nodes on the right, we get a postorder traversal.

Binary Search Trees

We consider a particular kind of a binary tree called a Binary Search Tree (BST). The basic idea behind this data structure is to have such a storing repository that provides the efficient way of data sorting, searching and retriving.
  A BST is a binary tree where nodes are ordered in the following way:
  • each node contains one key (also known as data)
  • the keys in the left subtree are less then the key in its parent node, in short L < P;
  • the keys in the right subtree are greater the key in its parent node, in short P < R;
  • duplicate keys are not allowed.
In the following tree all nodes in the left subtree of 10 have keys < 10 while all nodes in the right subtree > 10. Because both the left and right subtrees of a BST are again search trees; the above definition is recursively applied to all internal nodes:

Implementation

We implement a binary search tree using a private inner class BSTNode. In order to support the binary search tree property, we require that data stored in each node is Comparable:
public class BST <AnyType extends Comparable<AnyType>>
{
   private Node<AnyType> root;

   private class Node<AnyType>
   {
      private AnyType data;
      private Node<AnyType> left, right;

      public Node(AnyType data)
      {
         left = right = null;
         this.data = data;
      }
   }
   ...

}

Insertion

The insertion procedure is quite similar to searching. We start at the root and recursively go down the tree searching for a location in a BST to insert a new node. If the element to be inserted is already in the tree, we are done (we do not insert duplicates). The new node will always replace a NULL reference.
Exercise. Given a sequence of numbers:
11, 6, 8, 19, 4, 10, 5, 17, 43, 49, 31
Draw a binary search tree by inserting the above numbers from left to right.

Searching

Searching in a BST always starts at the root. We compare a data stored at the root with the key we are searching for (let us call it as toSearch). If the node does not contain the key we proceed either to the left or right child depending upon comparison. If the result of comparison is negative we go to the left child, otherwise - to the right child. The recursive structure of a BST yields a recursive algorithm.
Searching in a BST has O(h) worst-case runtime complexity, where h is the height of the tree. Since s binary search tree with n nodes has a minimum of O(log n) levels, it takes at least O(log n) comparisons to find a particular node. Unfortunately, a binary serch tree can degenerate to a linked list, reducing the search time to O(n).

Deletion

Deletion is somewhat more tricky than insertion. There are several cases to consider. A node to be deleted (let us call it as toDelete)
  • is not in a tree;
  • is a leaf;
  • has only one child;
  • has two children.
If toDelete is not in the tree, there is nothing to delete. If toDelete node has only one child the procedure of deletion is identical to deleting a node from a linked list - we just bypass that node being deleted
Deletion of an internal node with two children is less straightforward. If we delete such a node, we split a tree into two subtrees and therefore, some children of the internal node won't be accessible after deletion. In the picture below we delete 8:
Deletion starategy is the following: replace the node being deleted with the largest node in the left subtree and then delete that largest node. By symmetry, the node being deleted can be swapped with the smallest node is the right subtree.
See BST.java for a complete implementation.
Exercise. Given a sequence of numbers:
11, 6, 8, 19, 4, 10, 5, 17, 43, 49, 31
Draw a binary search tree by inserting the above numbers from left to right and then show the two trees that can be the result after the removal of 11.

Non-Recursive Traversals

Depth-first traversals can be easily implemented recursively.A non-recursive implementation is a bit more difficult. In this section we implement a pre-order traversal as a tree iterator
public Iterator<AnyType> iterator()
{
  return new PreOrderIterator();
}
where the PreOrderIterator class is implemented as an inner private class of the BST class
private class PreOrderIterator implements Iterator<AnyType>
{
  ...
}
The main difficulty is with next() method, which requires the implicit recursive stack implemented explicitly. We will be using Java's Stack. The algorithm starts with the root and push it on a stack. When a user calls for the next() method, we check if the top element has a left child. If it has a left child, we push that child on a stack and return a parent node. If there is no a left child, we check for a right child. If it has a right child, we push that child on a stack and return a parent node. If there is no right child, we move back up the tree (by popping up elements from a stack) until we find a node with a right child. Here is the next()implementation
public AnyType next()
{
   Node cur = stk.peek();
   if(cur.left != null)
   {
      stk.push(cur.left);
   }
   else
   {
      Node tmp = stk.pop();
      while(tmp.right == null)
      {
         if (stk.isEmpty()) return cur.data;
         tmp = stk.pop();
      }
      stk.push(tmp.right);
   }
   return cur.data;
}
The following example.shows the output and the state of the stack during each call to next(). Note, the algorithm works on any binary trees, not necessarily binary search trees..
          
  Output     1    2    4    6    5    7    8    3  
  Stack    1  2
1
4
2
1
6
4
2
1
5
1
7
5
1
8
1
3 

A non-recursive preorder traversal can be eloquently implemented in just three lines of code. If you understand next()'s implementation above, it should be no problem to grasp this one:
public AnyType next()
{
   if (stk.isEmpty()) throw new java.util.NoSuchElementException();

   Node cur = stk.pop();
   if(cur.right != null) stk.push(cur.right);
   if(cur.left != null) stk.push(cur.left);

   return cur.data;
}
Note, we push the right child before the left child.

Level Order Traversal

Level order traversal processes the nodes level by level. It first processes the root, and then its children, then its grandchildren, and so on. Unlike the other traversal methods, a recursive version does not exist.
A traversal algorithm is similar to the non-recursive preorder traversal algorithm. The only difference is that a stack is replaced with a FIFO queue.
See BST.java for a complete implementation.

Tuesday, May 22, 2012

CSMA with Collision Avoidance



We have observed that CSMA/CD would break down in wireless networks because of hidden node and exposed nodes problems. We will have a quick recap of these two problems through examples.

Hidden Node Problem

In the case of wireless network it is possible that A is sending a message to B, but C is out of its range and hence while "listening" on the network it will find the network to be free and might try to send packets to B at the same time as A. So, there will be a collision at B. The problem can be looked upon as if A and C are hidden from each other. Hence it is called the "hidden node problem".

Exposed Node Problem

If C is transmitting a message to D and B wants to transmit a message to A, B will find the network to be busy as B hears C trnasmitting. Even if B would have transmitted to A, it would not have been a problem at A or D. CSMA/CD would not allow it to transmit message to A, while the two transmissions could have gone in parallel. 

Addressing hidden node problem (CSMA/CA)

Consider the figure above.Suppose A wants to send a packet to B. Then it will first send a small packet to B called "Request to Send" (RTS). In response, B sends a small packet to A called "Clear to Send" (CTS). Only after A receives a CTS, it transmits the actual data. Now, any of the nodes which can hear either CTS or RTS assume the network to be busy. Hence even if some other node which is out of range of both A and B sends an RTS to C (which can hear at least one of the RTS or CTS between A and B), C would not send a CTS to it and hence the communication would not be established between C and D.
One issue that needs to be addressed is how long the rest of the nodes should wait before they can transmit data over the network. The answer is that the RTS and CTS would carry some information about the size of the data that B intends to transfer. So, they can calculate time that would be required for the transmission to be over and assume the network to be free after that.Another interesting issue is what a node should do if it hears RTS but not a corresponding CTS. One possibility is that it assumes the recipient node has not responded and hence no transmission is going on, but there is a catch in this. It is possible that the node hearing RTS is just on the boundary of the node sending CTS. Hence, it does hear CTS but the signal is so deteriorated that it fails to recognize it as a CTS. Hence to be on the safer side, a node will not start transmission if it hears either of an RTS or a CTS.
The assumption made in this whole discussion is that if a node X can send packets to a node Y, it can also receive a packet from Y, which is a fair enough assumption given the fact that we are talking of a local network where standard instruments would be used. If that is not the case additional complexities would get introduced in the system.

Does CSMA/CD work universally in the wired networks ?

The problem of range is there in wired networks as well in the form of deterioration of signals. Normally to counter this, we use repeaters, which can regenerate the original signal from a deteriorated one. But does that mean that we can build as long networks as we want with repeaters. The answer, unfortunately, is NO! The reason is the beyond a certain length CSMA/CD will break down.
The mechanism of collision detection which CSMA/CD follows is through listening while talking. What this means is so long as a node is transmitting the packet, it is listening on the cable. If the data it listens to is different from the data it is transmitting it assumes a collision. Once it has stopped transmitting the packet, and has not detected collision while transmission was going on, it assumes that the transmission was successful. The problem arises when the distance between the two nodes is too large. Suppose A wants to transmit some packet to B which is at a very large distance from B. Data can travel on cable only at a finite speed (usually 2/3c, c being the speed of light). So, it is possible that the packet has been transmitted by A onto the cable but the first bit of the packet has not yet reached B. In that case, if a collision occurs, A would be unaware of it occurring. Therefore there is problem in too long a network.
Let us try to parametrize the above problem. Suppose "t" is the time taken for the node A to transmit the packet on the cable and "T" is the time , the packet takes to reach from A to B. Suppose transmission at A starts at time t0. In the worst case the collision takes place just when the first packet is to reach B. Say it is at t0+T-e (e being very small). Then the collision information will take T-e time to propagate back to A. So, at t0+2(T-e) A should still be transmitting. Hence, for the correct detection of collision (ignoring e)

t > 2T

t increases with the number of bits to be transferred and decreases with the rate of transfer (bits per second). T increases with the distance between the nodes and decreases with the speed of the signal (usually 2/3c). We need to either keep t large enough or T as small. We do not want to live with lower rate of bit transfer and hence slow networks. We can not do anything about the speed of the signal. So what we can rely on is the minimum size of the packet and the distance between the two nodes. Therefore, we fix some minimum size of the packet and if the size is smaller than that, we put in some extra bits to make it reach the minimum size. Accordingly we fix the maximum distance between the nodes. Here too, there is a tradeoff to be made. We do not want the minimum size of the packets to be too large since that wastes lots of resources on cable. At the same time we do not want the distance between the nodes to be too small. Typical minimum packet size is 64 bytes and the corresponding distance is 2-5 kilometers.

Collision Free Protocols

Although collisions do not occur with CSMA/CD once a station has unambigously seized the channel, they can still occur during the contention period. These collisions adversely affect the efficiency of transmission. Hence some protocols have been developed which are contention free.

Bit-Map Method

In this method, there N slots. If node 0 has a frame to send, it transmit a 1 bit during the first slot. No other node is allowed to transmit during this period. Next node 1 gets a chance to transmit 1 bit if it has something to send, regardless of what node 0 had transmitted. This is done for all the nodes. In general node j may declare the fact that it has a frsme to send by inserting a 1 into slot j. Hence after all nodes have passed, each node has complete knowledge of who wants to send a frame. Now they begin transmitting in numerical order. Since everyone knows who is transmitting and when, there could never be any collision.
The basic problem with this protocol is its inefficiency during low load. If a node has to transmit and no other node needs to do so, even then it has to wait for the bitmap to finish. Hence the bitmap will be repeated over and over again if very few nodes want to send wasting valuable bandwidth.

Binary Countdown

In this protocol, a node which wants to signal that it has a frame to send does so by writing its address into the header as a binary number. The arbitration is such that as soon as a node sees that a higher bit position that is 0 in its address has been overwritten with a 1, it gives up. The final result is the address of the node which is allowed to send. After the node has transmitted the whole process is repeated all over again. Given below is an example situation.
NodesAddresses
A0010
B0101
C1010
D1001
----
1010
Node C having higher priority gets to transmit. The problem with this protocol is that the nodes with higher address always wins. Hence this creates a priority which is highly unfair and hence undesirable.

Limited Contention Protocols

Both the type of protocols described above - Contention based and Contention - free has their own problems. Under conditions of light load, contention is preferable due to its low delay. As the load increases, contention becomes increasingly less attractive, because the overload associated with channel arbitration becomes greater. Just the reverse is true for contention - free protocols. At low load, they have high delay, but as the load increases , the channel efficiency improves rather than getting worse as it does for contention protocols.
Obviously it would be better if one could combine the best properties of the contention and contention - free protocols, that is, protocol which used contention at low loads to provide low delay, but used a cotention-free technique at high load to provide good channel efficiency. Such protocols do exist and are called Limited contention protocols.
It is obvious that the probablity of some station aquiring the channel could only be increased by decreasing the amount of competition. The limited contention protocols do exactly that. They first divide the stations up into ( not necessarily disjoint ) groups. Only the members of group 0 are permitted to compete for slot 0. The competition for aquiring the slot within a group is contention based. If one of the members of that group succeeds, it aquires the channel and transmits a frame. If there is collision or no node of a particular group wants to send then the members of the next group compete for the next slot. The probablity of a particular node is set to a particular value ( optimum ).

Adaptive Tree Walk Protocol

The following is the method of adaptive tree protocol. Initially all the nodes are allowed to try to aquire the channel. If it is able to aquire the channel, it sends its frame. If there is collision then the nodes are divided into two equal groups and only one of these groups compete for slot 1. If one of its member aquires the channel then the next slot is reserved for the other group. On the other hand, if there is a collision then that group is again subdivided and the same process is followed. This can be better understood if the nodes are thought of as being organised in a binary tree as shown in the following figure. 

Many improvements could be made to the algorithm. For example, consider the case of nodes G and H being the only ones wanting to transmit. At slot 1 a collision will be detected and so 2 will be tried and it will be found to be idle. Hence it is pointless to probe 3 and one should directly go to 6,7.


Multiplexing

When two communicating nodes are connected through a media, it generally happens that bandwidth of media is several times greater than that of the communicating nodes. Transfer of a single signal at a time is both slow and expensive. The whole capacity of the link is not being utilized in this case. This link can be further exploited by sending several signals combined into one. This combining of signals into one is called multiplexing.
  1. Frequency Division Multiplexing (FDM): This is possible in the case where transmission media has a bandwidth than the required bandwidth of signals to be transmitted. A number of signals can be transmitted at the same time. Each source is allotted a frequency range in which it can transfer it's signals, and a suitable frequency gap is given between two adjescent signals to avoid overlapping. This is type of multiplexing is commonly seen in the cable TV networks.
  2. Time Division Multiplexing (TDM): This is possible when data transmission rate of the media is much higher than that of the data rate of the source. Multiple signals can be transmitted if each signal is allowed to be transmitted for a definite amount of time. These time slots are so small that all transmissions appear to be in parallel.
    1. Synchronous TDM: Time slots are preassigned and are fixed. Each source is given it's time slot at every turn due to it. This turn may be once per cycle, or several turns per cycle ,if it has a high data transfer rate, or may be once in a no. of cycles if it is slow. This slot is given even if the source is not ready with data. So this slot is transmitted empty.
    2. Asynchronous TDM: In this method, slots are not fixed. They are allotted dynamically depending on speed of sources, and whether they are ready for transmission.

Network Topologies

A network topology is the basic design of a computer network. It is very much like a map of a road. It details how key network components such as nodes and links are interconnected. A network's topology is comparable to the blueprints of a new home in which components such as the electrical system, heating and air conditioning system, and plumbing are integrated into the overall design. Taken from the Greek work "Topos" meaning "Place," Topology, in relation to networking, describes the configuration of the network; including the location of the workstations and wiring connections. Basically it provides a definition of the components of a Local Area Network (LAN). A topology, which is a pattern of interconnections among nodes, influences a network's cost and performance. There are three primary types of network topologies which refer to the physical and logical layout of the Network cabling. They are:
  1. Star Topology: All devices connected with a Star setup communicate through a central Hub by cable segments. Signals are transmitted and received through the Hub. It is the simplest and the oldest and all the telephone switches are based on this. In a star topology, each network device has a home run of cabling back to a network hub, giving each device a separate connection to the network. So, there can be multiple connections in parallel.

    Advantages
    • Network administration and error detection is easier because problem is isolated to central node
    • Networks runs even if one host fails
    • Expansion becomes easier and scalability of the network increases
    • More suited for larger networks
    Disadvantages
    • Broadcasting and multicasting is not easy because some extra functionality needs to be provided to the central hub
    • If the central node fails, the whole network goes down; thus making the switch some kind of a bottleneck
    • Installation costs are high because each node needs to be connected to the central switch
  2. Bus Topology: The simplest and one of the most common of all topologies, Bus consists of a single cable, called a Backbone, that connects all workstations on the network using a single line. All transmissions must pass through each of the connected devices to complete the desired request. Each workstation has its own individual signal that identifies it and allows for the requested data to be returned to the correct originator. In the Bus Network, messages are sent in both directions from a single point and are read by the node (computer or peripheral on the network) identified by the code with the message. Most Local Area Networks (LANs) are Bus Networks because the network will continue to function even if one computer is down. This topology works equally well for either peer to peer or client server.

    The purpose of the terminators at either end of the network is to stop the signal being reflected back.Advantages
    • Broadcasting and multicasting is much simpler
    • Network is redundant in the sense that failure of one node doesn't effect the network. The other part may still function properly
    • Least expensive since less amount of cabling is required and no network switches are required
    • Good for smaller networks not requiring higher speeds
    Disadvantages
    • Trouble shooting and error detection becomes a problem because, logically, all nodes are equal
    • Less secure because sniffing is easier
    • Limited in size and speed
  3. Ring Topology: All the nodes in a Ring Network are connected in a closed circle of cable. Messages that are transmitted travel around the ring until they reach the computer that they are addressed to, the signal being refreshed by each node. In a ring topology, the network signal is passed through each network card of each device and passed on to the next device. Each device processes and retransmits the signal, so it is capable of supporting many devices in a somewhat slow but very orderly fashion. There is a very nice feature that everybody gets a chance to send a packet and it is guaranteed that every node gets to send a packet in a finite amount of time.
    Advantages
    • Broadcasting and multicasting is simple since you just need to send out one message
    • Less expensive since less cable footage is required
    • It is guaranteed that each host will be able to transmit within a finite time interval
    • Very orderly network where every device has access to the token and the opportunity to transmit
    • Performs better than a star network under heavy network load
    Disadvantages
    • Failure of one node brings the whole network down
    • Error detection and network administration becomes difficult
    • Moves, adds and changes of devices can effect the network
    • It is slower than star topology under normal load
Generally, a BUS architecture is preferred over the other topologies - ofcourse, this is a very subjective opinion and the final design depends on the requirements of the network more than anything else. Lately, most networks are shifting towards the STAR topology. Ideally we would like to design networks, which physically resemble the STAR topology, but behave like BUS or RING topology.

Data Link Layer

Data link layer can be characterized by two types of layers:
  1. Medium Access Layer (MAL)
  2. Logical Link Layer

Aloha Protocols

History

The Aloha protocol was designed as part of a project at the University of Hawaii. It provided data transmission between computers on several of the Hawaiian Islands using radio transmissions.
  • Communications was typically between remote stations and a central sited named Menehune or vice versa.
  • All message to the Menehune were sent using the same frequency.
  • When it received a message intact, the Menehune would broadcast an ack on a distinct outgoing frequency.
  • The outgoing frequency was also used for messages from the central site to remote computers.
  • All stations listened for message on this second frequency.

Pure Aloha

Pure Aloha is an unslotted, fully-decentralized protocol. It is extremely simple and trivial to implement. The ground rule is - "when you want to talk, just talk!". So, a node which wants to transmits, will go ahead and send the packet on its broadcast channel, with no consideration whatsoever as to anybody else is transmitting or not. 

One serious drawback here is that, you dont know whether what you are sending has been received properly or not (so as to say, "whether you've been heard and understood?"). To resolve this, in Pure Aloha, when one node finishes speaking, it expects an acknowledgement in a finite amount of time - otherwise it simply retransmits the data. This scheme works well in small networks where the load is not high. But in large, load intensive networks where many nodes may want to transmit at the same time, this scheme fails miserably. This led to the development of Slotted Aloha.

Slotted Aloha

This is quite similar to Pure Aloha, differing only in the way transmissions take place. Instead of transmitting right at demand time, the sender waits for some time. This delay is specified as follows - the timeline is divided into equal slots and then it is required that transmission should take place only at slot boundaries. To be more precise, the slotted-Aloha makes the following assumptions:
  • All frames consist of exactly L bits.
  • Time is divided into slots of size L/R seconds (i.e., a slot equals the time to transmit one frame).
  • Nodes start to transmit frames only at the beginnings of slots.
  • The nodes are synchronized so that each node knows when the slots begin.
  • If two or more frames collide in a slot, then all the nodes detect the collision event before the slot ends.


In this way, the number of collisions that can possibly take place is reduced by a huge margin. And hence, the performance become much better compared to Pure Aloha. collisions may only take place with nodes that are ready to speak at the same time. But nevertheless, this is a substantial reduction.

Carrier Sense Mutiple Access Protocols

In both slotted and pure ALOHA, a node's decision to transmit is made independently of the activity of the other nodes attached to the broadcast channel. In particular, a node neither pays attention to whether another node happens to be transmitting when it begins to transmit, nor stops transmitting if another node begins to interfere with its transmission. As humans, we have human protocols that allow allows us to not only behave with more civility, but also to decrease the amount of time spent "colliding" with each other in conversation and consequently increasing the amount of data we exchange in our conversations. Specifically, there are two important rules for polite human conversation:
  1. Listen before speaking: If someone else is speaking, wait until they are done. In the networking world, this is termed carrier sensing - a node listens to the channel before transmitting. If a frame from another node is currently being transmitted into the channel, a node then waits ("backs off") a random amount of time and then again senses the channel. If the channel is sensed to be idle, the node then begins frame transmission. Otherwise, the node waits another random amount of time and repeats this process.
  2. If someone else begins talking at the same time, stop talking. In the networking world, this is termed collision detection - a transmitting node listens to the channel while it is transmitting. If it detects that another node is transmitting an interfering frame, it stops transmitting and uses some protocol to determine when it should next attempt to transmit.
It is evident that the end-to-end channel propagation delay of a broadcast channel - the time it takes for a signal to propagate from one of the the channel to another - will play a crucial role in determining its performance. The longer this propagation delay, the larger the chance that a carrier-sensing node is not yet able to sense a transmission that has already begun at another node in the network.

CSMA- Carrier Sense Multiple Access

This is the simplest version CSMA protocol as described above. It does not specify any collision detection or handling. So collisions might and WILL occur and clearly then, this is not a very good protocol for large, load intensive networks.
So, we need an improvement over CSMA - this led to the development of CSMA/CD.

CSMA/CD- CSMA with Collision Detection

In this protocol, while transmitting the data, the sender simultaneously tries to receive it. So, as soon as it detects a collission (it doesn't receive its own data) it stops transmitting. Thereafter, the node waits for some time interval before attempting to transmit again. Simply put, "listen while you talk". But, how long should one wait for the carrier to be freed? There are three schemes to handle this:
  1. 1-Persistent: In this scheme, transmission proceeds immediately if the carrier is idle. However, if the carrier is busy, then sender continues to sense the carrier until it becomes idle. The main problem here is that, if more than one transmitters are ready to send, a collision is GUARANTEED!!
  2. Non-Persistent: In this scheme, the broadcast channel is not monitored continuously. The sender polls it at random time intervals and transmits whenever the carrier is idle. This decreases the probability of collisions. But, it is not efficient in a low load situation, where number of collisions are anyway small. The problems it entails are:
    • If back-off time is too long, the idle time of carrier is wasted in some sense
    • It may result in long access delays
  3. p-Persistent: Even if a sender finds the carrier to be idle, it uses a probabilistic distribution to determine whether to transmit or not. Put simply, "toss a coin to decide". If the carrier is idle, then transmission takes place with a probability p and the sender waits with a probability 1-p. This scheme is a good trade off between the Non-persistent and 1-persistent schemes. So, for low load situations, p is high (example: 1-persistent); and for high load situations, p may be lower. Clearly, the value of p plays an important role in determining the performance of this protocol. Also the same p is likely to provide different performance at different loads.
CSMA/CD doesn't work in some wireless scenarios called "hidden node" problems. Consider a situation, where there are 3 nodes - A, B and C communicating with each other using a wireless protocol. Morover, B can communicate with both A and C, but A and C lie outside each other's range and hence can't communicate directly with each other. Now, suppose both A and C want to communicate with B simultaneously. They both will sense the carrier to be idle and hence will begin transmission, and even if there is a collision, neither A nor C will ever detect it. B on the other hand will receive 2 packets at the same time and might not be able to understand either of them. To get around this problem, a better version called CSMA/CA was developed, specially for wireless applications.


Session Layer

It deals with the concept of Sessions i.e. when a user logins to a remote server he should be authenticated before getting access to the files and application programs. Another job of session layer is to establish and maintain sessions. If during the transfer of data between two machines the session breaks down, it is the session layer which re-establishes the connection. It also ensures that the data transfer starts from where it breaks keeping it transparent to the end user. e.g. In case of a session with a database server, this layer introduces check points at various places so that in case the connectoin is broken and reestablished, the transition running on the database is not lost even if the user has not committed. This activity is called Synchronization. Another function of this layer is Dialogue Control which determines whose turn is it to speak in a session. It is useful in video conferencing.

Presentation Layer

This layer is concerned with the syntax and semantics of the information transmitted. In order to make it possible for computers with different data representations to communicate data structures to be exchanged can be defined in abstract way alongwith standard encoding. It also manages these abstract data structres and allows higher level of data structres to be defined an exchange. It encodes the data in standard agreed way(network format). Suppose there are two machines A and B one follows 'Big Endian' and other 'Little Endian' for data representation. This layer ensures that the data transmitted by one gets converted in the form compatibale to othe machine. This layer is concerned with the syntax and semantics of the information transmitted.In order to make it possible for computers with different data representations to communicate data structures to be exchanged canbe defined in abstract way alongwith standard encoding. It also manages these abstract data structres and allows higher level of data structres to be defined an exchange. Other functions include compression, encryption etc.

Application Layer

The seventh layer contains the application protocols with which the user gains access to the network. The choice of which specific protocols and their associated functions are to be used at the application level is up to the individual user. Thus the boundary between the presentation layer and the application layer represents a separation of the protocols imposed by the network designers from those being selected and implemented by the network users.For example commonly used protocols are HTTP(for web browsing), FTP(for file transfer) etc.

Network Layers as in Practice

In most of the networks today, we do not follow the OSI model of seven layers. What is actually implemented is as follows. The functionality of Application layer and Presentation layer is merged into one and is called as the Application Layer. Functionalities of Session Layer is not implemented in most networks today. Also, the Data Link layer is split theoretically into MAC (Medium Access Control) Layer andLLC (Link Layer Control). But again in practice, the LLC layer is not implemented by most networks. So as of today, the network architecture is of 5 layers only.


Network Layers in Internet Today

Some Related Links on OSI Model and TCP Model


Physical Layer

Physical layer is concerned with transmitting raw bits over a communication channel. The design issues have to do with making sure that when one side sends a 1 bit, it is recieved by the other side as 1 bit and not as 0 bit. In physical layer we deal with the communication medium used for transmission.

Types of Medium

Medium can be classified into 2 categories.
  1. Guided Media : Guided media means that signals is guided  by the prescence of physical media i.e. signals are under control and remains in the physical wire. For eg. copper wire.
  2. Unguided Media : Unguided Media means that there is no physical path for the signal to propogate. Unguided media are essentially electro-magnetic waves. There is no control on flow of signal. For eg. radio waves.

Communication Links

In a nework nodes are connected through links. The communication through links can be classified as
  1. Simplex : Communication can take place only in one direction. eg. T.V broadcasting.
  2. Half-duplex : Communication can take place in one direction at a time. Suppose node A and B are connected then half-duplex communication means that at a time data can flow from A to B or from B to A but not simultaneously. eg. two persons talking to each other such that when speaks the other listens and vice versa.
  3. Full-duplex : Communication can take place simultaneously in both directions. eg. A discussion in a group without discipline.
Links can be further classified as
  1. Point to Point : In this communication only two nodes are connected to each other. When a node sends a packet then it can be recieved only by the node on the other side and none else.
  2. Multipoint : It is a kind of sharing communication, in which signal can be recieved by all nodes. This is also called broadcast.
Generally two kind of problems are associated in transmission of signals.
  1. Attenuation : When a signal transmitts in a network then the quality of signal degrades as the signal travels longer distances in the wire. This is called attenuation. To improve quality of signal amplifiers are used at regular distances.
  2. Noise : In a communication channel many signals transmits simultaneously, certain random signals are also present in the medium. Due to interference of these signals our signal gets disrupted a bit.

Bandwidth

Bandwidth simply means how many bits can be transmitted per second in the communication channel. In technical terms it indicates the width of frequency spectrum.

Transmission Media

Guided Transmission Media
In Guided transmission media generally two kind of materials are used.
  1. Copper
    • Coaxial Cable
    • Twisted Pair
  2. Optical Fiber
  1. Coaxial Cable: Coaxial cable consists of an inner conductor and an outer conductor which are seperated by an insulator. The inner conductor is usually copper. The outer conductor is covered by a plastic jacket. It is named coaxial because the two conductors are coaxial. Typical diameter of coaxial cable lies between 0.4 inch to 1 inch. The most application of coaxial cable is cable T.V. The coaxial cable has high bandwidth, attenuation is less.
  2. Twisted Pair: A Twisted pair consists of two insulated copper wires, typically 1mm thick. The wires are twisted togather in a helical form the purpose of twisting is to reduce cross talk interference between several pairs. Twisted Pair is much cheaper then coaxial cable but it is susceptible to noise and electromagnetic interference and attenuation is large.

    Twisted Pair can be further classified in two categories:
    Unshielded twisted pair: In this no insulation is provided, hence they are susceptible to interference.
    Shielded twisted pair: In this a protective thick insulation is provided but shielded twisted pair is expensive and not commonly used.The most common application of twisted pair is the telephone system. Nearly all telephones are connected to the telephone company office by a twisted pair. Twisted pair can run several kilometers without amplification, but for longer distances repeaters are needed. Twisted pairs can be used for both analog and digital transmission. The bandwidth depends on the thickness of wire and the distance travelled. Twisted pairs are generally limited in distance, bandwidth and data rate.
  3. Optical Fiber: In optical fiber light is used to send data. In general terms prescence of light is taken as bit 1 and its absence as bit 0. Optical fiber consists of inner core of either glass or plastic. Core is surrounded by cladding of the same material but of different refrective index. This cladding is surrounded by a plastic jacket which prevents optical fiber from electromagnetic interferrence and harshy environments. It uses the principle of total internal reflection to transfer data over optical fibers. Optical fiber is much better in bandwidth as compared to copper wire, since there is hardly any attenuation or electromagnetic interference in optical wires. Hence there is less requirement to improve quality of signal, in long distance transmission. Disadvantage of optical fiber is that end points are fairly expensive. (eg. switches)Differences between different kinds of optical fibers:
    1. Depending on material
      • Made of glass
      • Made of plastic.
    2. Depending on radius
      • Thin optical fiber
      • Thick optical fiber
    3. Depending on light source
      • LED (for low bandwidth)
      • Injection lased diode (for high bandwidth)

Wireless Transmission

  1. Radio: Radio is a general term that is used for any kind of frequency. But higher frequencies are usually termed as microwave and the lower frequency band comes under radio frequency. There are many application of radio. For eg. cordless keyboard, wireless LAN, wireless ethernet. but it is limited in range to only a few hundred meters. Depending on frequency radio offers different bandwidths.
  2. Terrestrial microwave: In terrestrial microwave two antennas are used for communication. A focused beam emerges from an antenna and is recieved by the other antenna, provided that antennas should be facing each other with no obstacle in between. For this reason antennas are situated on high towers. Due to curvature of earth terristial microwave can be used for long distance communication with high bandwidth. Telecom department is also using this for long distance communication. An advantage of wireless communication is that it is not required to lay down wires in the city hence no permissions are required.
  3. Satellite communication: Satellite acts as a switch in sky. On earth VSAT(Very Small Aperture Terminal) are used to transmit and recieve data from satellite. Generally one station on earth transmitts signal to satellite and it is recieved by many stations on earth. Satellite communication is generally used in those places where it is very difficult to obtain line of sight i.e. in highly irregular terristial regions. In terms of noise wireless media is not as good as the wired media. There are frequency band in wireless communication and two stations should not be allowed to transmit simultaneously in a frequency band. The most promising advantage of satellite is broadcasting. If satellites are used for point to point communication then they are expensive as compared to wired media.



This lecture introduces the ISO-OSI layered architecture of Networks. According to the ISO standards, networks have been divided into 7 layers depending on the complexity of the fucntionality each of these layers provide. The detailed description of each of these layers is given in the notes below. We will first list the layers as defined by the standard in the increasing order of function complexity:

Physical Layer

This layer is the lowest layer in the OSI model. It helps in the transmission of data between two machines that are communicating through a physical medium, which can be optical fibres,copper wire or wireless etc. The following are the main functions of the physical layer:
  1. Hardware Specification: The details of the physical cables, network interface cards, wireless radios, etc are a part of this layer.

    Coaxial CableHybrid CableWireless CardNetwork Card
  2. Encoding and Signalling: How are the bits encoded in the medium is also decided by this layer. For example, on the coppar wire medium, we can use differnet voltage levels for a certain time interval to represent '0' and '1'. We may use +5mV for 1nsec to represent '1' and -5mV for 1nsec to represent '0'. All the issues of modulation is dealt with in this layer. eg, we may use Binary phase shift keying for the representation of '1' and '0' rather than using different volatage levels if we have to transfer in RF waves.


    Binary Phase Shift Keying
  3. Data Transmission and Reception: The transfer of each bit of data is the responsibility of this layer. This layer assures the transmissoin of each bit with a high probability. The transmission of the bits is not completely reliable as their is no error correction in this layer.
  4. Topology and Network Design: The network design is the integral part of the physical layer. Which part of the network is the router going to be placed, where the switches will be used, where we will put the hubs, how many machines is each switch going to handle, what server is going to be placed where, and many such concerns are to be taken care of by the physical layer. The variosu kinds of netopologies that we decide to use may be ring, bus, star or a hybrid of these topologies depending on our requirements.

Data Link Layer

This layer provides reliable transmission of a packet by using the services of the physical layer which transmits bits over the medium in an unreliable fashion. This layer is concerned with :
  1. Framing : Breaking input data into frames (typically a few hundred bytes) and caring about the frame boundaries and the size of each frame.
  2. Acknowledgment : Sent by the receiving end to inform the source that the frame was received without any error.
  3. Sequence Numbering : To acknowledge which frame was received.
  4. Error Detection : The frames may be damaged, lost or duplicated leading to errors.The error control is on link to link basis.
  5. Retransmission : The packet is retransmitted if the source fails to receive acknowledgment.
  6. Flow Control : Necessary for a fast transmitter to keep pace with a slow receiver.


Data Link Layer

Network Layer

Its basic functions are routing and congestion control.
Routing: This deals with determining how packets will be routed (transferred) from source to destination. It can be of three types :
  • Static : Routes are based on static tables that are "wired into" the network and are rarely changed.
  • Dynamic : All packets of one application can follow different routes depending upon the topology of the network, the shortest path and the current network load.
  • Semi-Dynamic : A route is chosen at the start of each conversation and then all the packets of the application follow the same route.


Routing

The services provided by the network can be of two types :
  • Connection less service: Each packet of an application is treated as an independent entity. On each packet of the application the destination address is provided and the packet is routed.
  • Connection oriented service: Here, first a connection is established and then all packets of the application follow the same route. To understand the above concept, we can also draw an analogy from the real life. Connection oriented service is modeled after the telephone system. All voice packets go on the same path after the connection is established till the connection is hung up. It acts like a tube ; the sender pushes the objects in at one end and the receiver takes them out in the same order at the other end. Connection less service is modeled after the postal system. Each letter carries the destination address and is routed independent of all the others. Here, it is possible that the letter sent first is delayed so that the second letter reaches the destination before the first letter.
Congestion Control: A router can be connected to 4-5 networks. If all the networks send packet at the same time with maximum rate possible then the router may not be able to handle all the packets and may drop some/all packets. In this context the dropping of the packets should be minimized and the source whose packet was dropped should be informed. The control of such congestion is also a function of the network layer. Other issues related with this layer are transmitting time, delays, jittering. 

Internetworking: Internetworks are multiple networks that are connected in such a way that they act as one large network, connecting multiple office or department networks. Internetworks are connected by networking hardware such as routers, switches, and bridges.Internetworking is a solution born of three networking problems: isolated LANs, duplication of resources, and the lack of a centralized network management system. With connected LANs, companies no longer have to duplicate programs or resources on each network. This in turn gives way to managing the network from one central location instead of trying to manage each separate LAN. We should be able to transmit any packet from one network to any other network even if they follow different protocols or use different addressing modes.


Inter-Networking

Network Layer does not guarantee that the packet will reach its intended destination. There are no reliability guarantees.

Transport Layer

Its functions are :
  • Multiplexing / Demultiplexing : Normally the transport layer will create distinct network connection for each transport connection required by the session layer. The transport layer may either create multiple network connections (to improve throughput) or it may multiplex several transport connections onto the same network connection (because creating and maintaining networks may be expensive). In the latter case, demultiplexing will be required at the receiving end. A point to note here is that communication is always carried out between two processes and not between two machines. This is also known as process-to-process communication.
  • Fragmentation and Re-assembly : The data accepted by the transport layer from the session layer is split up into smaller units (fragmentation) if needed and then passed to the network layer. Correspondingly, the data provided by the network layer to the transport layer on the receiving side is re-assembled.

    FragmentationReassembly
  • Types of service : The transport layer also decides the type of service that should be provided to the session layer. The service may be perfectly reliable, or may be reliable within certain tolerances or may not be reliable at all. The message may or may not be received in the order in which it was sent. The decision regarding the type of service to be provided is taken at the time when the connection is established.
  • Error Control : If reliable service is provided then error detection and error recovery operations are also performed. It provides error control mechanism on end to end basis.
  • Flow Control : A fast host cannot keep pace with a slow one. Hence, this is a mechanism to regulate the flow of information.
  • Connection Establishment / Release : The transport layer also establishes and releases the connection across the network. This requires some sort of naming mechanism so that a process on one machine can indicate with whom it wants to communicate.

Friday, May 18, 2012

Abstract Class


There are situations in which you will want to define a superclass that declares the structure of a given abstraction without providing a complete implementation of every method. That is, sometimes you will want to create a superclass that only defines a generalized form that will be shared by all of its subclasses, leaving it to each subclass to fill in the details. Such a class determines the nature of the methods that the subclasses must implement. One way this situation can occur is when a superclass is unable to create a meaningful implementation for a method. This is the case with the class Figureused in the preceding example. The definition of area( ) is simply a placeholder. It will not compute and display the area of any type of object.
As you will see as you create your own class libraries, it is not uncommon for a method to have no meaningful definition in the context of its superclass. You can handle this situation two ways. One way, as shown in the previous example, is to simply have it report a warning message. While this approach can be useful in certain situations—such as debugging—it is not usually appropriate. You may have methods which must be overridden by the subclass in order for the subclass to have any meaning. Consider the class Triangle. It has no meaning if area( ) is not defined. In this case, you want some way to ensure that a subclass does, indeed, override all necessary methods. Java's 
solution to this problem is the abstract method.
You can require that certain methods be overridden by subclasses by specifying the abstract type modifier. These methods are sometimes referred to as subclasser responsibility because they have no implementation specified in the superclass. Thus, a subclass must override them—it cannot simply use the version defined in the superclass. To declare an abstract method, use this general form:
abstract type name(parameter-list);
As you can see, no method body is present. Any class that contains one or more abstract methods must also be declared abstract. To declare a class abstract, you simply use the abstract keyword in front of the class keyword at the beginning of the class declaration. There can be no objects of an abstract class. That is, an abstract class cannot be directly instantiated with the new operator. Such objects would be useless, because an abstract class is not fully defined. Also, you cannot declare abstract constructors, or abstract static methods. Any subclass of an abstract class must either implement all of the abstract methods in the superclass, or be itself declared abstract
Here is a simple example of a class with an abstract method, followed by a class which implements that method:
// A Simple demonstration of abstract. 
abstract class A { 
abstract void callme(); 
// concrete methods are still allowed in abstract classes 
void callmetoo() { 
System.out.println("This is a concrete method."); 

}
class B extends A { 
void callme() { 
System.out.println("B's implementation of callme."); 

}
class AbstractDemo { 
public static void main(String args[]) { 
B b = new B(); 
b.callme(); 
b.callmetoo(); 

}
Notice that no objects of class are declared in the program. As mentioned, it is not possible to instantiate an abstract class. One other point: class implements a concrete method calledcallmetoo( ). This is perfectly acceptable. Abstract classes can include as much implementation as they see fit.
Although abstract classes cannot be used to instantiate objects, they can be used to create object references, because Java's approach to run-time polymorphism is implemented through the use of superclass references. Thus, it must be possible to create a reference to an abstract class so that it can be used to point to a subclass object. You will see this feature put to use in the next example. Using an abstract class, you can improve the Figure class shown earlier. Since there is no meaningful concept of area for an undefined two-dimensional figure, the following version of the program declaresarea( ) as abstract inside Figure. This, of course, means that all classes derived from Figure must override area( ).
// Using abstract methods and classes. 
abstract class Figure { 
double dim1; 
double dim2; 
Figure(double a, double b) { 
dim1 = a; 
dim2 = b; 

// area is now an abstract method 
abstract double area(); 
}
class Rectangle extends Figure { 
Rectangle(double a, double b) { 
super(a, b); 

// override area for rectangle 
double area() { 
System.out.println("Inside Area for Rectangle."); 
return dim1 * dim2; 

}
class Triangle extends Figure { 
Triangle(double a, double b) { 
super(a, b); 

// override area for right triangle 
double area() { 
System.out.println("Inside Area for Triangle."); 
return dim1 * dim2 / 2; 

}
class AbstractAreas { 
public static void main(String args[]) { 
// Figure f = new Figure(10, 10); // illegal now 
Rectangle r = new Rectangle(9, 5); 
Triangle t = new Triangle(10, 8); 
Figure figref; // this is OK, no object is created 
figref = r; 
System.out.println("Area is " + figref.area()); 
figref = t; 
System.out.println("Area is " + figref.area()); 

}
As the comment inside main( ) indicates, it is no longer possible to declare objects of type Figure, since it is now abstract. And, all subclasses of Figure must override area( ). To prove this to yourself, try creating a subclass that does not override area( ). You will receive a compile-time error.
Although it is not possible to create an object of type Figure, you can create a reference variable of type Figure. The variable figref is declared as a reference to Figure, which means that it can be used to refer to an object of any class derived from Figure. As explained, it is through superclass reference variables that overridden methods are resolved at run time.