Pages

Friday, February 11, 2011

P, NP, NP-complete, and NP-hard

What are P, NP, NP-complete, and NP-hard?

There are two classes of problems, P, and NP (there are many, many more, but we will ignore the rest here).  These stand for "Polynomial" and "Non-deterministic Polynomial".

Assuming knowledge of big-O notation, we'll move on from there.  Request an explanation and you shall have it.

Problems in P are solvable by a polynomial-time algorithm (runs in O(f(n)), where f is polynomial).  This includes things like sorting and triangulation.

Problems in NP are checkable by a polynomial-time algorithm, but not necessarily solvable.  Generally, we mean NP minus P, when we say NP (so they are only checkable).  This means that, given the solution to an instance of the problem, we can verify that it is the correct solution in polynomial time, but that we couldn't have come up with it on our own, in polynomial time.  An easy example of this is subset sum: given a set of numbers, does there exist a subset whose sum is zero?  It turns out that this is not solvable in polynomial time, but given the answer, we can easily check that it is right (the decision problem is a little harder to see, but assume you are given the actual subset).

Now, we have to discuss NP-Complete and NP-Hard, which requires a discussion of reducibility.  Reducibility is the notion that, given an instance of some hard problem, we can quickly construct an instance of some other problem, whose answer would help us solve the first problem.

The typical logic is this: you claim to have a fast algorithm to, say, solve subset sum.  I create a machine that, in polynomial time, takes a traveling salesman problem instance, and creates from it, an instance of subset sum.  If you can give my machine the answer to that question, it will, again in polynomial time, change that back into the answer for my original traveling salesman problem.  Therefore, if your subset sum algorithm is really polynomial, then I have just added some extra bits to it and created a polynomial time algorithm for solving traveling salesman problems.  If I can create such a machine, then it creates a notion of reducibility, that is, traveling salesman is reducible to subset sum, which indicates that subset sum is at least as hard as traveling salesman.  Intuitively, one might imagine that with all the universal truth out there, one could say that traveling salesman has no polynomial-time solution.  In that case, the above would be a proof by contradiction, that subset sum also has no polynomial time solution.

Now, we can pretty easily state what NP-Complete and NP-Hard are:

If two problems can be reduced to each-other, they are in some sense, "equivalent".  All problems in NP that are so equivalent, are called NP-Complete.

If there is an NP-Complete problem that can be reduced to some problem H, then H is NP-Hard.  This means that, in some sense, H is at least as hard as all the NP-Complete problems, but note that there does not need to be a reduction from H to any NP-Complete problem, so H might be harder than all the NP problems.

One proves that a problem is NP-Hard by taking an existing NP-Complete problem (3-SAT and Traveling Salesman are my favorites, since they work well for geometry), and using it to generate input to your problem, such that an answer to your problem is equivalent to solving the NP-Complete problem you chose.

Sunday, February 6, 2011

How to find invisible friends in GTalk

Your webmaster search is: how to find invisible friends in gtalk
Here is the simple trick to find the friends who are online in google talk but appearing offline.
Follow these steps to check who all are in invisible state.

  1. Open the chat window by clicking your friend’s name
  2. Click you friends name and select “Go off the Record”.
  3. Send some message to that user.
  4. if you get the feedback from gtalk that “User is offline and can’t receive messages” in red color means that the user is really offline.
  5. But if you get no response that means that the user is appearing offline and in invisible mode.

Monday, January 31, 2011

Proxy settings for apt in Ubuntu

For proxy settings in Ubuntu
create a new file called "apt.conf"  in the folder "/etc/apt/" and 
add this line to the file 
acquire::http::proxy "http://username:password@proxysereveraddress:portnumber";


ex: http::proxy "http:://mc09mt13:******@10.0.0.21:3128";

thats it now you can install any software using 



apt-get install *****

Friday, January 14, 2011

Database Indexing

Why is it needed?
When data is stored on disk based storage devices, it is stored as blocks of data. These blocks are accessed in their entirety, making them the atomic disk access operation. Disk blocks are structured in much the same way as linked lists; both contain a section for data, a pointer to the location of the next node (or block), and both need not be stored contiguously.
Due to the fact that a number of records can only be sorted on one field, we can state that searching on a field that isn’t sorted requires a Linear Search which requires N/2 block accesses, where N is the number of blocks that the table spans. If that field is a non-key field (i.e. doesn’t contain unique entries) then the entire table space must be searched at N block accesses.
Whereas with a sorted field, a Binary Search may be used, this has log2 N block accesses. Also since the data is sorted given a non-key field, the rest of the table doesn’t need to be searched for duplicate values, once a higher value is found. Thus the performance increase is substantial.
What is indexing?
Indexing is a way of sorting a number of records on multiple fields. Creating an index on a field in a table creates another data structure which holds the field value, and pointer to the record it relates to. This index structure is then sorted, allowing Binary Searches to be performed on it.
The downside to indexing is that these indexes require additional space on the disk, since the indexes are stored together in a MyISAM database, this file can quickly reach the size limits of the underlying file system if many fields within the same table are indexed.

How does it work?
Firstly, let’s outline a sample database table schema;
Field name         Data type      Size on disk
id (Primary key)   Unsigned INT   4 bytes
firstName          Char(50)       50 bytes
lastName           Char(50)       50 bytes
emailAddress       Char(100)      100 bytes
Note: char was used in place of varchar to allow for an accurate size on disk value. 
This sample database contains five million rows, and is unindexed. The performance of several queries will now be analyzed. These are a query using the id (a sorted key field) and one using the firstName (a non-key unsorted field).

Example 1
Given our sample database of r = 5,000,000 records of a fixed size giving a record length of R = 204 bytes and they are stored in a MyISAM database which is using the default block size B = 1,024 bytes. The blocking factor of the table would be bfr = (B/R) = 1024/204 = 5 records per disk block. The total number of blocks required to hold the table is N = (r/bfr) = 5000000/5 = 1,000,000 blocks.
A linear search on the id field would require an average of N/2 = 500,000 block accesses to find a value given that the id field is a key field. But since the id field is also sorted a binary search can be conducted requiring an average of log2 1000000 = 19.93 = 20 block accesses. Instantly we can see this is a drastic improvement.
Now the firstName field is neither sorted, so a binary search is impossible, nor are the values unique, and thus the table will require searching to the end for an exact N = 1,000,000 block accesses. It is this situation that indexing aims to correct.
Given that an index record contains only the indexed field and a pointer to the original record, it stands to reason that it will be smaller than the multi-field record that it points to. So the index itself requires fewer disk blocks that the original table, which therefore requires fewer block accesses to iterate through. The schema for an index on the firstName field is outlined below;
Field name        Data type   Size on disk
firstName         Char(50)    50 bytes
(record pointer)  Special     4 bytes
Note: Pointers in MySQL are 2, 3, 4 or 5 bytes in length depending on the size of the table.
Example 2
Given our sample database of r = 5,000,000 records with an index record length of R = 54 bytes and using the default block size B = 1,024 bytes. The blocking factor of the index would be bfr = (B/R) = 1024/54 = 18 records per disk block. The total number of blocks required to hold the table is N = (r/bfr) = 5000000/18 = 277,778 blocks.
Now a search using the firstName field can utilise the index to increase performance. This allows for a binary search of the index with an average of log2 277778 = 18.08 = 19 block accesses. To find the address of the actual record, which requires a further block access to read, bringing the total to 19 + 1 = 20 block accesses, a far cry from the 1,000,000 block accesses required by the non-indexed table.

When should it be used?
Given that creating an index requires additional disk space (277,778 blocks extra from the above example), and that too many indexes can cause issues arising from the file systems size limits, careful thought must be used to select the correct fields to index.
Since indexes are only used to speed up the searching for a matching field within the records, it stands to reason that indexing fields used only for output would be simply a waste of disk space and processing time when doing an insert or delete operation, and thus should be avoided. Also given the nature of a binary search, the cardinality or uniqueness of the data is important. Indexing on a field with a cardinality of 2 would split the data in half, whereas a cardinality of 1,000 would return approximately 1,000 records. With such a low cardinality the effectiveness is reduced to a linear sort, and the query optimizer will avoid using the index if the cardinality is greater than 30% of the record number, effectively making the index a waste of space.

An index can be created in a table to find data more quickly and efficiently.

The users cannot see the indexes, they are just used to speed up searches/queries.
Note: Updating a table with indexes takes more time than updating a table without (because the indexes also need an update). So you should only create indexes on columns (and tables) that will be frequently searched against.

SQL CREATE INDEX Syntax

CREATE INDEX index_name ON table_name (column_name)

Example: The SQL statement below creates an index named "PIndex" on the "LastName" column in the "Persons" table:  

CREATE INDEX PIndex ON Persons (LastName)

Tuesday, December 28, 2010

Hacking Way: Download unlimited Videos or full high quality onl...

Hacking Way: Download unlimited Videos or full high quality onl...: "you can download unlimited videos from megavideo and download into many formats.. its very easy to use... Download Now"

Wednesday, December 22, 2010

10 Major Differences Between C And JAVA

The search analytics for the blog reveals that the most popular post on the site is on the differences
between C & C++ & moreover, people have been searching for C Vs Java and Java Vs C++.
So, here are the major differences between C And JAVA.

  1. JAVA is Object-Oriented while C is procedural. Different Paradigms, that is. Most differences between the features of the two languages arise due to the use of different programming paradigms. C breaks down to functions while JAVA breaks down to Objects. C is more procedure-oriented while JAVA is data-oriented.
  2. Java is an Interpreted language while C is a compiled language. We all know what a compiler does. It takes your code & translates it into something the machine can understand-that is to say-0’s & 1’s-the machine-level code. That’s exactly what happens with our C code-it gets ‘compiled’. While with JAVA, the code is first transformed to what is called the bytecode. This bytecode is then executed by the JVM(Java Virtual Machine). For the same reason, JAVA code is more portable. 
  3. C is a low-level language while JAVA is a high-level language. C is a low-level language(difficult interpretation for the user, closer significance to the machinelevel code) while JAVA is a high-level lagunage(abstracted from the machine-level details, closer significance to the program itself).
  4. C uses the top-down approach while JAVA uses the bottom-up approach.  In C, formulating the program begins by defining the whole and then splitting them into smaller elements. JAVA(and C++ and other OOP languages) follows the bottom-up approach where the smaller elements combine together to form the whole. 
  5. Pointer go backstage in JAVA while C requires explicit handling of pointers. When it comes to JAVA, we don’t need the *’s & &’s to deal with pointers & their addressing. More formally, there is no pointer syntax required in JAVA. It does what it needs to do. While in JAVA, we do create references for objects. 
  6. The Behind-the-scenes Memory Management with JAVA & The User-Based Memory Management in C. Remember ‘malloc’ & ‘free’? Those are the library calls used in C to allocate & free chunks of memory for specific data(specified using the keyword ’sizeof’). Hence in C, the memory is managed by the user while JAVA uses a garbage collector that deletes the objects that no longer have any references to them. 
  7. JAVA supports Method Overloading while C does not support overloading at all. JAVA supports function or method overloading-that is we can have two or more functions with the same name(with certain varying parameters like return types to allow the machine to differentiate between them). That it to say, we can overload methods with the same name having different method signatures. JAVA(unlike C++), does not support Operator Overloading while C does not allow overloading at all. 
  8. Unlike C, JAVA does not support Preprocessors, & does not really them. The preprocessor directives like #include & #define, etc are considered one of the most essential elements of C programming. However, there are no preprocessors in JAVA. JAVA uses other alternatives for the preprocessors. For instance, public static final is used instead of the #define preprocessor. Java maps class names to a directory and file structure instead of the  #include used to include files in C. 
  9. The standard Input & Output Functions. Although this difference might not hold any  conceptual(intuitive) significance, but it’s maybe just the tradition. C uses the printf & scanf functions as its standard input & output while JAVA uses the System.out.print & System.in.read functions. 
  10. Exception Handling in JAVA And the errors & crashes in C. When an error occurs in a Java program it results in an exception being thrown. It can then be handled using various exception handling techniques. While in C, if there’s an error, there IS an error.

10 Major Differences Between C And C++

C++, as the name suggests is a superset of C. As a matter of fact, C++ can run most of C code
while C cannot run C++ code. Here are the 10 major differences between C++ & C…

  1. C follows the procedural programming paradigm while C++ is a multi-paradigm language(procedural as well as object oriented) In case of C, importance is given to the steps or procedure of the program while C++ focuses on the data rather than the process. Also, it is easier to implement/edit the code in case of C++ for the same reason.
  2. In case of C, the data is not secured while the data is secured(hidden) in C++ This difference is due to specific OOP features like Data Hiding which are not present in C. 
  3. C is a low-level language while C++ is a middle-level language. C is regarded as a low-level language(difficult interpretation & less user friendly) while C++ has features of both low-level(concentration on whats going on in the machine hardware) & high level languages(concentration on the program itself) & hence is regarded as a middle-level language.
  4. C uses the top-down approach while C++ uses the bottom-up approach In case of C, the program is formulated step by step, each step is processed into detail while in C++, the base elements are first formulated which then are linked together to give rise to larger systems.
  5. C is function-driven while C++ is object-driven Functions are the building blocks of a C program while objects are building blocks of a C++ program. 
  6. C++ supports function overloading while C does not Overloading means two functions having the same name in the same program. This can be done only in C++ with the help of Polymorphism(an OOP feature)
  7. We can use functions inside structures in C++ but not in C. In case of C++, functions can be used inside a structure while structures cannot contain functions in C.
  8. The NAMESPACE feature in C++ is absent in case of C. C++ uses NAMESPACE which avoid name collisions. For instance, two students enrolled in the same university cannot have the same roll number while two students in different universities might have the same roll number. The universities are two different namespace & hence contain the same roll number(identifier) but the same university(one namespace) cannot have two students with the same roll number(identifier)
  9. The standard input & output functions differ in the two languages C uses scanf & printf while C++ uses cin>> & cout<< as their respective input & output functions 
  10. C++ allows the use of reference variables while C does not Reference variables allow two variable names to point to the same memory location. We cannot use these variables in C programming.