Premanandh_j: 2012-07-15

Saturday, 21 July 2012

RELATING OOPS, DATA STRUCTURE ,DBMS ,WEB

               This blog tries to relate the topics that are being blogged about. So lets try to relate all these things to get the big picture of them all.

                Lets take a journey from the web services and we can move through the rest of the topics.Consider a user accessing a web page to download a picture. The web services starts from a client request to the server. So lets guess how this works.
             The client sends a message which reaches the server. There can be many paths from client to server and the client must choose the optimal path. This is done by using path finding algorithms. The information will be stored in the networking devices. Proper data structures should be used to give the optimal path to the request from the client.
             The web page from which the request starts will contain many components. Each component will have particular attributes and behavior. This can be easily established using Object Oriented Programming Languages. Here each component will be represented as an object.
           Once the request reaches the server the web server will get the request and will forward it to the application server. This application server will be used to get the picture.
           The picture will be stored in a database. The DBMS will abstract the application programs from the manner in which the data is stored in a database. There will be different data structures which will be used to retrieve the data efficiently. The most preferred data structures used here are indexes and trees(B+).
   Database concentrates more on the data while DS concentrates more about organizing the data.
          Anything which is getting stored will require a data structure. Hence the data structure will be an essential in a programming language like OOPS. The OOPs is related to RDBMS. Because the master tables can be represented by using the objects. There also exists OODBMS which purely depends on the Object orientation.

Friday, 20 July 2012

INSERTION AND RETRIEVE DATA

INSERTION:
        *) The insert command works with the syntax,
        insert into <table_name> values(.....);

        *) There can also be bulk insert where you can insert multiple values into the database in a single query.
         insert into <table_name> values (field1,field2,....) values
                           (value11,value12,....) , (value21,value22,....) ,...;

   *) There is also an option of inserting data from other tables using the following command.
   insert into table1 select * from table2;

   *) You can also insert particular values into the table1 from table2,
insert into table1 select id,name,age from table2;

   *) If you need to insert data from another table you can do it by mentioning the database name of the table before the table name.
          insert into db1.table select * from db2.table2;

RETRIEVAL OF DATA:
        The retrieval of data is done using select command. The where command is used to set the conditions for selecting the columns. Now let us discuss those commands.
        select <columns> from <tables> where <conditions>;
        To select all the columns use the ' * ' symbol. To select the tables list the tables. In the conditions, you can use '=' operator to check for equality, '>' greater than, '<' less than.
        To compare the results of various operators we can have AND and OR conditions attached to them.
        ORDER BY is a clause used to order the rows retrieved based on the columns specified on the ORDER BY clause.
        GROUP BY is a clause used to group the rows containing the columns into a single row.
   The following functions are usually used along with GROUP BY clause and they can be used independently also.
         *) COUNT function is used to return the number of values of a particular column.
         *) SUM function is used to add the values of a particular column.
         *) AVG function is used to find average values of a particular column.
   *) MAX function is used to find the maximum in a particular column.
   *) MIN function is used to find the minimum in a particular column.
   The LIKE operator is used in the following manner,
         SELECT column_name(s) FROM table_name WHERE column_name LIKE pattern.
   The pattern can have a '%' symbol either in the beginning or in the end or both the ends. The '%' symbol will be substituted by one or many characters.

Thursday, 19 July 2012

MODIFYING TABLE

Modifying Table :
                     Modifying table is modifying the schema or contents of the table. The schema can be altered by using ALTER,UPDATE,DELETE command. Modifying involves,

ALTER:
        ALTER command changes the schema of the tables.
        1) Altering the data type or adding new constraints or changing names of
            column ,
            ALTER TABLE <table_name> CHANGE COLUMN <column_name>
            <new_column_name> data_type constraints ;

            For_example : If i want to change the car_id from largeint to int with
             AUTO INCREMENT constraint we can do this by,
             ALTER TABLE cars CHANGE COLUMN car_id car_id INT
             PRIMARY_KEY AUTO INCREMENT;

        2) We can also change the column name ,
             ALTER TABLE cars ADD <column_name> AFTER/BEFORE
   <column_name>

   For_example : If we want to add car_color column after the car_id we
             can add it using,
                ALTER TABLE cars ADD car_color after car_id;

UPDATE:
              Update table is used to change the values inside the table. This is done by selecting particular rows and changing the attributes in those rows.
            UPDATE TABLE <table_name> set <column_name1>=value WHERE
             <column_name1>=value1;
             For_example : If we want to change the color of car having id=102 to
             'Yellow'
             UPDATE TABLE cars set car_color='Yellow' WHERE
             car_id=102;
DELETE:
         DELETE is used to delete either selected records from the table or the entire records in a table. The DELETE command can be used in either way as follows
             DELETE FROM <table_name> where.....
             The above mentioned is used to delete the selected records depending on the where clause.
              DELETE * FROM <table_name>
              The above mentioned will delete all the columns from the table.

TRUNCATE:
               TRUNCATE is used to delete the entire contents from the table. The syntax is,
                TRUNCATE TABLE <table_name>

TRUNCATE VS DELETE:
                Applying DELETE for all records in a table and TRUNCATE may seem similar. They give the same output when they execute. But there is a difference between the both. DELETE logs before deleting each record. DELETE is costlier as it involves resources like locking and logging. But DELETE is more safe to apply than TRUNCATE.

DROP:
             DROP differ from TRUNCATE and DELETE because DROP deletes the schema along with the entries. DROP is applied either to a column or a table. When the schema is deleted all the integrity constraints, access privileges will be deleted.
          Lets quote two examples to differentiate between DROP and TRUNCATE.
           eg.1: TRUNCATE TABLE CAR;
                    SELECT count(*) from car;
                     output :
                      count(*)
                     --------------
                        0
          e.g.2: TRUNCATE TABLE CAR;
                    SELECT count(*) from car;
                     output :
                     ERROR at line 1: ORA-00942: table or view does not exist.

Creating a Table and the commands to work on it..

Here we will be dealing about the various commands to work with in the DBMS. The DBMS which we are using here is MYSQL.

Creating and Deleting a Database :
                 The DBMS can contain any number of databases. Here we can consider a database for cars. So how will you create one...?
         CREATE DATABASE [ IF NOT EXISTS ] cars ;
This command helps in creating a database called cars. the IF NOT EXISTS is used to check the database whether another database by the same name exists. This is optional but helps in preventing errors in-case another database by the same name exists. To use this database we should use the command,
                   USE cars ;
   This helps in fetching the database cars and then we can create tables in the database and work on it.

To delete a database from the DBMS we can use the command,

   DROP DATABASE [ IF EXISTS ] cars ;
   The IF EXISTS is again an optional command where you can check in prior whether the database already exists or not. DROP is a risky command to use as it deletes the database along with the entire tables inside the database.

Creating the tables :
   Once you enter into the database you can create tables inside it. You need to be clear about various things before you create a table. We will discuss various things in detail before we know about the syntax of create tables.

a ) Data Types .

                 Data Types offer a lot of data types to work on. Each column will contain a particular data type. The data types can be NUMERIC, STRING, DATE AND TIME, SPATIAL. Picking a data type for a column determines on
            1) How do you want to represent the data
            2) Whether it is a fixed length or variable length.
            3) If there is a need to index the data
4) How MySQL will compare the values in the data types.

b) Storage Engines .
             MySQL offers various types of storage engines. The type of storage engines which are offered innoDB, MyISAM, BDB, HEAP, MERGE. The default engine which will be used is MyISAM. It provides full support to indexing but it is not transaction safe. Only innoDB and BDB are transaction safe.
   MyISAM is used when you need fast access of data without any need to concentrate on foreign key constraints. InnoDB supports foreign key constraints.
We can change the table type(engine) on your need.
   InnoDB vs MyISAM :

InnoDB provides referential integrity(foreign key constraints)
InnoDB provides row level locking. Hence it is used in write intensive applications.
MyISAM is simple and hence it is generally preferred for read intensive applications.
Genrally MyISAM is faster as it uses full-text indexing.

c) Identifying Primary key and Foreign key.
              Primary key is the key which makes it possible to uniquely identify a record in a table. The primary key can be an attribute(column) or a set of attributes(columns).
        Example: A car database can contain a car table. The car table needs a attribute to differentiate between two records. This can be achieved using having a car_id. This id should be unique and NOT_NULL.
              Usually a table should contain primary key. If you don't have any primary key it wont be an error. Languages like Microsoft Access will maintain a default Primary key with an auto number data type.

               Foreign keys are primary keys of another table. Foreign key comes to use to maintain relationships between tables. Foreign keys are used to cross reference between two tables.

               Example: Coming back to the car database. There can be a manufacturer table which contains a manufacturer_id as primary key in it. Each car will have a manufacturer associated with it. This relation can be maintained by having a new column in car table known as manufacturer_id. Now, it will be easy to know the manufacturer of the car and also no car can contain unregistered manufacturer.

d) Identifying Nullable and NOT NULL fields in database :
                     There are certain situations when you don't need to specify certain fields and there are certain attributes which should not be left unfilled. For example a car may have a owner or may not have a owner. So it is not always needed to fill this attribute of the record. But each car should have tires so the tire_company_id should be filled and should not be left unfilled.
                       We can also have a Default field which allows a record to take a default value in-case it wasn't filled during insertion.

e) Identify exact data type for fields :
   This is an important aspect in creating the table. Now, lets come back to car table. Here car_id can be a largeint as there is a need to store large number of cars and each record should have a unique identifier. The car_model indicating the model of the car should be a string data type. In that it can be a a varchar(10) - 10 letter characters. Because no model of a car can be more than 10 letters. So having some common sense and practical knowledge will let you pick the data types and fields.

f) Naming Conventions for Column Names:
The basic things to follow for naming columns are
                1) To check whether the naming makes sense to you,
                2) To check if the naming makes sense to others,
                The first point is because if you see your code code few months later you may get confused. The second is because when many people are working with you the naming should be clear to both and the other people who uses it.
The preferred way of naming are using singular words and preventing mentioning the table names as a prefix of the attribute.

e) How to determine which field to index :
                        Creating index will slow down the entire database as there is a need to maintain index and there is overheads involved in creating and deleting the index. Generally primary fields are indexed but non primary fields are sometimes indexed too.
                       Index helps in improving the search if the value searched is same as the value required.
              Example: If the car_id is indexed queries requiring the car_id's can be executed fast. But if there is a query regarding car_id and the model_name then the index wont be useful.
                     Create index by keeping in mind the query performance and the maintenance of the records.

Database Management System

What is DBMS :
DBMS is a software that helps in easy retrieval and managing the database. A database is a storage structure which contains tables. The tables will have records and attributes. Records are the row in the tables and Attributes are the column in the tables.

DBMS VS FILES :
Files are common data storage where the data is stored in individual files. The files can have several formats. The DBMS came to replace the file systems. They have the following advantages over the DBMS. They are,

There is a centralized software which runs over the entire database and this helps managing the entire information easily.
This provides abstraction between the application programs and the way the data is stored and organized.
Integrity of data is established. You can verify that the integrity constraints are met easily. For example, assigning a new employee to meet a customer before accommodating a new customer.
Consistency of data is maintained as there are recovery methods to recover from any failure.
Concurrent access are permitted so that a user may feel that he is the only person accessing the data.
Redundancy of data is avoided.
Flexibility is there as you can add new constraints to the database.
There is a problem of incompatible files when files are used.

   Internally the DBMS uses files to store the data in the hard disk. But it helps to programmer from caring about the way in which the data is stored and gives an environment to concentrate on the application development.

DBMS VS RDBMS :
              RDBMS is called the Relational Database Management. Here there exists relation between the tables. Relations can be one-to-one ,one-to-many ,many-to-many.
              RDBMS is a DBMS. But it adds more constraint in maintaining the relationships between tables.

What is a relationship in a RDBMS ?
   People usually come to know that RDBMS maintains relationships in database. So how is it maintained...?
         If there is a one-to-one or one-to-many relationship, it can be maintained by using a foreign key in the child table.
   In case of many to many relationships there needs an external relationship table to establish a relationship between the two tables.
         The relationship table will contain the primary keys of both table mapped against each other. This relationship table is used because to avoid redundancy of data in the child table.

Sunday, 15 July 2012

Data Structures

The Audio blog for this is in here

Data Structures:
                   Data Structures are used in computers to store and retrieve data data. Data structures are used to organize data so that they will meet the requirements of the applications. Data type and Data Structures are similar. Data type refers a well defined format of storing data and applying operations over the data. Data Structures are implementations of abstract data types.
           Some examples of data structures are Stack, Queue, Hash, Trees etc., By abstract data type we mean that , Stack is an abstract data type and it can have operations such as push(x), pop(x), is_empty(). But the implementation is programmer defined.
                Class Stack {
   Data_type data;
   Date_type pop()
                   { }
                   void push(Data_type x)
                   {}
                    bool Isempty()
                    {}
                 }
              The above mentioned stack is just an abstract class and the implementation can use either an array or linked list. For example ,
               Class array_stack : public Stack{
Stack()
   {a = new (size of Data_type * N);,i=0;}
   Data_type pop()
   {
return a[i--];
                  }
                 void push(Data_type x)
                 {
a[++i] = x;
                  }
                  bool Isempty()
                 {if (i>=0) return false; else return true;}
               }
Time and Space Complexity:
        Time complexity is a measure of the time taken for a particular code to execute. It is usually represented using a 'O' notation. Here we will use the term n- no of inputs. Generally the time taken for order of execution in ascending order is 1, log n, n, nlog n, n^2,.... The space complexity is represented in the same manner. It is the space required for computing the given problem.

Uses of data Structures :
                Data structures are used based on the particular application.
When you want to retrieve the max of the elements you are storing you can go for max heap the inverse applies for min heap. Queues can be used when there is a particular resource and many entities need the same resource. Then the queue can be used to order the entities based on the time of arrival.
                Array is a simple method of storing data where you can access the data easily if you know the index where it is stored. The uses of data structures are very large and it is not easy to list them all within the space.

Which data structure is better ?
                No two data structures can be compared. Each one is best suitable for the particular. Comparing two data structure is like comparing a world class piano artist with a world heavy weight champion. Sounds dumb right.!! If you wanna hear good music you can watch the artist performance. To watch a gr8 fight go for the boxer.
             So the point i am trying to make hear is that the data structures which you choose should best choose your application. If you want retrieval of data instantly go for hash. If you want to get the extremes of the data go for heaps.

How to arrive at a particular data structure ?
           Having a good data structure for a problem depends on the ability to map a data structure to the given problem. Or else you need to check each data structure until you come up with the best one.
   Consider the given problem, if there is a string containing braces '{' and '}'. A string is valid if '{' contains a '}' pair following it later in the string. You need to find valid if the string is valid or not.
   So what data structure you will use..????
   A array is fine but it requires an extra storage and the execution time will be O(n^2). A hash map can be used by mapping the open braces to a list containing their position. So when a closed brace comes check for a preceding open brace in the list and delete it from the hash map. Stop the process until the string ends or you get a unmatched close brace.
        A stack will be the good choice as it doesn't require extra memory and the execution time is O(n). So the best possible solution here is a stack.
        Sometimes a single data structure cannot give a solution and more than one DS need to be coupled to get the output efficiently.

List:
        List are dynamic data structures where you can allocate memory and remove the memory of the list members during the runtime of the program. Each list has a self referential pointer and you need the pointer to move through the list. The list is not a preferable DS if you need to access elements faster. And the pointer manipulations are somewhat tricky so the implementation requires caution.
       Lists are categorized as singly linked list, Doubly linked list, circular linked list based on the link structure.

Sorting :
         Sorting is to arrange the elements present in a ordered way. Programmers have developed a number of sorting methods. The best sorting methods offer a running time complexity of O(nlog n ). They are merge and quick sorts.
           Both these sorts come under divide and conquer methods. By divide and conquer we divide the bigger problem, solve those subproblems and recursively you will end up in solving the problem.
           For example, If you need to make 500 students in height order. Split the student into groups of 20, order them and then joining the ordered groups. It will be better than picking the next taller student each time from the group of 500.

Searching :
        Searching is searching for a particular data in the Data Structure. Hash offers a better search having a time complexity of O( 1 ). DFS(Depth First Search) is a method of searching the data. The algorithm for the DFS is,

          Get the Root
Push root into the Stack
          While (The Stack is not empty)
                 Pop stack top into temp
                 if (Temp contains the data)
                      return the data
                 push the children of the temp into the stack

   This can be done using recursion also. But it is better to present it this way as recursion may lead to stack overflow.
     For Sorted data in a array you can apply binary search.

Premanandh_j