Naive Bayes Classification using WEKA

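// Imports needed by the snippet; the try block itself belongs inside a method.
import java.io.BufferedReader;
import java.io.FileReader;
import weka.classifiers.bayes.NaiveBayes;
import weka.classifiers.meta.FilteredClassifier;
import weka.core.Instances;
import weka.filters.unsupervised.attribute.StringToWordVector;

try
{
    BufferedReader trainReader = new BufferedReader(new FileReader("train.arff")); // file with labeled text examples
    BufferedReader classifyReader = new BufferedReader(new FileReader("test.arff")); // file with text to classify

    Instances trainInsts = new Instances(trainReader);
    Instances classifyInsts = new Instances(classifyReader);
    trainReader.close();
    classifyReader.close();

    // The class label is the last attribute in both files.
    trainInsts.setClassIndex(trainInsts.numAttributes() - 1);
    classifyInsts.setClassIndex(classifyInsts.numAttributes() - 1);

    FilteredClassifier model = new FilteredClassifier();
    StringToWordVector stringToWordVector = new StringToWordVector();
    // Setter names as found in older WEKA releases; newer versions configure
    // stopwords and tokenizers through different methods.
    stringToWordVector.setUseStoplist(true);
    stringToWordVector.setOnlyAlphabeticTokens(true);
    model.setFilter(stringToWordVector); // pass the configured filter, not a fresh default one
    model.setClassifier(new NaiveBayes());
    model.buildClassifier(trainInsts);
    //System.out.println(model);

    // Predict a class for every instance and store it back on the instance.
    for (int i = 0; i < classifyInsts.numInstances(); i++)
    {
        classifyInsts.instance(i).setClassMissing();
        double cls = model.classifyInstance(classifyInsts.instance(i));
        classifyInsts.instance(i).setClassValue(cls);
    }
    System.out.println(classifyInsts);
}
catch (Exception o)
{
    System.err.println(o.getMessage());
}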

Filed under WEKA

Google Bookmarks API

XML Data Access

Description: Of all the export options, this is the best one if you are trying to build a web application on top of the Google Bookmarks service.
Method: GET
Parameters:
num=a large number (if the parameter is omitted, only 25 bookmarks are returned)

RSS Data Access

Description: Certain tools can handle an RSS feed but cannot handle an arbitrary XML file.
Address: http://www.google.com/bookmarks/?output=rss
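
As a rough illustration, the GET request could be assembled like this in Java. The base address is the one given above for RSS. Combining it with num is an assumption, since num is listed only under XML access, and the class and method names are made up for this sketch. It only builds the address; no request is sent.

```java
import java.net.URI;

final class BookmarksUrl {
    // Base address taken from the RSS section of the post.
    static final String BASE = "http://www.google.com/bookmarks/";

    // Builds the request URI. A num of 0 or less leaves the parameter out,
    // in which case the post says only 25 bookmarks are returned.
    static URI build(String output, int num) {
        StringBuilder sb = new StringBuilder(BASE).append("?output=").append(output);
        if (num > 0) {
            sb.append("&num=").append(num); // assumption: num is accepted alongside output
        }
        return URI.create(sb.toString());
    }

    public static void main(String[] args) {
        System.out.println(build("rss", 1000));
    }
}
```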

Filed under Google API

Print contents using javascript (*.js)

1. JavaScript code: put the following in a printContent.js file.

function printSelection(node) {
    // Copy the markup of the element to print into a new window.
    var content = node.innerHTML;
    var pwin = window.open('', 'print_content', 'width=500,height=500');
    pwin.document.open();
    pwin.document.write('<html><body>' + content + '</body></html>');
    pwin.document.close();
    pwin.focus();
    pwin.print();
    pwin.close();
}

2. Call the JavaScript function from an *.html page in the client-click handler of a control:

"printSelection(document.getElementById('chartRenderArea'));return false;"

Here, chartRenderArea is the id of the div tag whose contents should be printed.

Filed under javascript

Adapter Pattern vs Visitor Pattern

The Adapter pattern converts the interface of a class into another interface that clients expect, so that two otherwise incompatible classes can work together. It is especially useful for legacy code that was written long ago, that one may not have access to, and whose functionality needs adapting. It is equally useful for off-the-shelf code such as toolkits, libraries, or other third-party software.

The Visitor pattern separates an algorithm from the object structure on which it operates. This makes it possible to add new operations to an existing object structure without modifying that structure.

Similarities:

  • Visitor lets us add behavior to a class structure without changing the classes themselves. Adapter likewise accomplishes its task without changing the class it adapts.
  • Both patterns can be used to add functionality to existing classes.
  • Both patterns promote flexibility by providing a level of indirection to another object.
  • Adapter creates reusable classes that cooperate with unrelated classes that have incompatible interfaces. Visitor also creates reusable classes, which can be applied to different object structures.
  • Adding or modifying an operation is comparatively easy in both patterns.

Dissimilarities:

  • The main purpose of Adapter is to make two incompatible objects work together, whereas the purpose of Visitor is separation of concerns.
  • Adapter is a structural pattern: it deals with composing classes and objects into larger structures that exhibit desired properties. Visitor is a behavioral pattern: it is concerned with communication between objects.
  • An adapter does not need to know about the internal workings of the class it adapts. A visitor, in contrast, has to take into account the internals of the classes it visits.
  • Adapter does not break the encapsulation property of OOP. Visitor can break the encapsulation of the composite classes, because the designer is forced to provide public operations that expose the elements’ internal state.
  • Adapter uses a single class hierarchy to adapt one interface to another, whereas Visitor uses two class hierarchies: one for the elements being operated on and one for the visitors that define the operations on those elements.

Example of adapter pattern:

The Adapter pattern can be applied to an application that needs to connect to a book provider web service to get a list of book names for a search query term. Suppose the BookServiceGateway class is the API we need to work with. This is the Adaptee class. It has a method that returns the books matching the search term as a generic list, in a format that is not compatible with our requirements. IBookGateway is an interface that represents the Target component, and the BookGatewayAdapter class is the implementation of the Adapter. BookGatewayAdapter is the class the client interacts with. It does the heavy work of resolving the incompatibility and acts as a bridge between the Adaptee and the client: it iterates through the list of books and returns the list of book names to the client.

Figure 1. Applying Adapter pattern in adapting web services APIs.

In the above example it is not possible to apply the Visitor pattern, because Visitor requires knowledge of the details of the object structure and requires adding a callback (an accept method) to the web service API. In this case the web service API cannot be modified, because it is a third-party service; it could just as well be a DLL that is completely hidden and cannot be changed in any way. The Adapter pattern is therefore the right choice here. Encapsulation is also preserved, and two incompatible classes can work together to serve the purpose of the client.
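
The adapter example can be sketched in Java roughly as follows. The class and interface names BookServiceGateway, IBookGateway and BookGatewayAdapter come from the description above; the Book type, the method names and the canned data are assumptions made up for illustration, since the real service API is not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical data type returned by the third-party service.
final class Book {
    private final String name;
    Book(String name) { this.name = name; }
    String getName() { return name; }
}

// Adaptee: stands in for the third-party API, which we cannot modify.
final class BookServiceGateway {
    private final List<Book> catalog = Arrays.asList(
        new Book("Design Patterns"), new Book("Refactoring"), new Book("Pattern Hatching"));

    // Returns full Book objects, which is not the format the client wants.
    List<Book> searchBooks(String query) {
        List<Book> hits = new ArrayList<>();
        for (Book book : catalog) {
            if (book.getName().toLowerCase().contains(query.toLowerCase())) {
                hits.add(book);
            }
        }
        return hits;
    }
}

// Target: the interface the client expects.
interface IBookGateway {
    List<String> findBookNames(String query);
}

// Adapter: wraps the adaptee and converts its result into book names.
final class BookGatewayAdapter implements IBookGateway {
    private final BookServiceGateway service;
    BookGatewayAdapter(BookServiceGateway service) { this.service = service; }

    @Override
    public List<String> findBookNames(String query) {
        List<String> names = new ArrayList<>();
        for (Book book : service.searchBooks(query)) {
            names.add(book.getName());
        }
        return names;
    }
}

final class AdapterDemo {
    public static void main(String[] args) {
        IBookGateway gateway = new BookGatewayAdapter(new BookServiceGateway());
        System.out.println(gateway.findBookNames("pattern"));
    }
}
```

The client depends only on IBookGateway, so the adaptee stays untouched and could be swapped for another provider by writing a second adapter.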

Visitor Pattern Example:

Let’s consider creating a reporting module in an application to produce statistics about a group of customers. The statistics should be very detailed, so all the data related to each customer must be traversed. Every entity involved in this hierarchy must accept a visitor. Let CustomerGroup, Customer, Order and Item be the visitable classes, with IVisitor and IVisitable as the respective interfaces for visitors and visitable classes. A CustomerGroup represents a group of customers, each Customer can have one or more orders, and each Order can have one or more Items. GeneralReport is a visitor class that implements the IVisitor interface. Other visitors (other reports or other types of visitors) can be added by implementing the IVisitor interface.

Figure 2. Visitor pattern in customer report generation.

Here the Visitor pattern fits well. The report generated by the visitor depends on the object structure that holds the statistics, so Adapter cannot be used in this situation; Adapter exists to make two incompatible classes work together. Extending the module with new operations is easy: add another concrete visitor class. Adapter does not break encapsulation, but here the CustomerGroup composite gives up some encapsulation in exchange for separation of concerns and better flexibility. Visitor can add capabilities to a composite of objects as needed. Also, polymorphic dispatch takes care of all the decision making, which is not the case in the Adapter pattern.
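
A minimal Java sketch of the reporting example is given below. The type names IVisitor, IVisitable, CustomerGroup, Customer, Order, Item and GeneralReport come from the description above; the price field, the add helpers and the counters kept by GeneralReport are assumptions chosen only to make the sketch runnable.

```java
import java.util.ArrayList;
import java.util.List;

interface IVisitor {
    void visit(CustomerGroup group);
    void visit(Customer customer);
    void visit(Order order);
    void visit(Item item);
}

interface IVisitable {
    void accept(IVisitor visitor);
}

final class Item implements IVisitable {
    final double price; // hypothetical field, exposed so visitors can read it
    Item(double price) { this.price = price; }
    public void accept(IVisitor visitor) { visitor.visit(this); }
}

final class Order implements IVisitable {
    private final List<Item> items = new ArrayList<>();
    Order add(Item item) { items.add(item); return this; }
    public void accept(IVisitor visitor) {
        visitor.visit(this);
        for (Item item : items) item.accept(visitor); // pass the visitor down
    }
}

final class Customer implements IVisitable {
    private final List<Order> orders = new ArrayList<>();
    Customer add(Order order) { orders.add(order); return this; }
    public void accept(IVisitor visitor) {
        visitor.visit(this);
        for (Order order : orders) order.accept(visitor);
    }
}

final class CustomerGroup implements IVisitable {
    private final List<Customer> customers = new ArrayList<>();
    CustomerGroup add(Customer customer) { customers.add(customer); return this; }
    public void accept(IVisitor visitor) {
        visitor.visit(this);
        for (Customer customer : customers) customer.accept(visitor);
    }
}

// Concrete visitor: collects simple statistics while walking the structure.
final class GeneralReport implements IVisitor {
    int customers;
    int orders;
    int items;
    double revenue;
    public void visit(CustomerGroup group) { /* nothing to record for the group itself */ }
    public void visit(Customer customer) { customers++; }
    public void visit(Order order) { orders++; }
    public void visit(Item item) { items++; revenue += item.price; }
}

final class VisitorDemo {
    static GeneralReport run() {
        CustomerGroup group = new CustomerGroup().add(
            new Customer()
                .add(new Order().add(new Item(10.0)).add(new Item(5.5)))
                .add(new Order().add(new Item(4.5))));
        GeneralReport report = new GeneralReport();
        group.accept(report);
        return report;
    }

    public static void main(String[] args) {
        GeneralReport r = run();
        System.out.println(r.customers + " customer(s), " + r.orders + " order(s), "
            + r.items + " item(s), revenue " + r.revenue);
    }
}
```

A second report would be another class implementing IVisitor; none of the four visitable classes would need to change.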

Filed under Software Design Patterns

Linux Commands

http://www.oreillynet.com/linux/cmd/

Filed under Linux/Unix

Migration from SQL Server 2000 to SQL Server 2008

Steps:

  1. Create the structure:
    • Right-click the source database and select Tasks → Generate Scripts. Select the object types you want to migrate.
    • Open the script against the destination database and run it to clone the database structure.
  2. Store data:
    • Right-click the source database and select Tasks → Export Data. Select the source and destination databases.
    • Select the option “Copy data from one or more tables or views”.
    • Match the tables. Click each row and hit the Edit Mappings button. Check the “Enable identity insert” checkbox. Repeat for all the tables you wish to migrate.

Filed under SQL Server

jQuery All in one Place

Filed under jQuery

jQuery Intellisense in VS 2008 – 3 Easy steps

To enable IntelliSense completion for jQuery in Visual Studio, follow these three steps:

1. Install VS 2008 SP1 or Visual Studio Web Developer Express 2008

2. Install VS 2008 Patch KB958502 to Support “-vsdoc.js” Intellisense Files

3. Download the jQuery-vsdoc.js file

  • Download jQuery-vsdoc.js here
  • Download jQuery here
  • Add these two files to your ASP.NET web application, in the same directory. Make sure both have the same version: for example, jquery-1.3.2.js and jquery-1.3.2-vsdoc.js are both version 1.3.2. The versions must match, otherwise IntelliSense will not work. Also, the vsdoc file is sometimes downloaded with a 2 in its name, such as jquery-1.3.2-vsdoc2.js instead of jquery-1.3.2-vsdoc.js. In that case, rename the file to jquery-1.3.2-vsdoc.js by removing the 2 after vsdoc.

After these three steps, add the tag <script src="assets/js/jquery-1.3.2.js" type="text/javascript"></script> to the header of your ASP.NET page, and you will see IntelliSense working in your page.

If you want IntelliSense in a separate JavaScript file, add the line /// <reference path="jquery-1.3.2-vsdoc.js" /> at the very beginning of that file.

Filed under jQuery

Glossary of SEO terms

Alt Tag: The alternative text that the browser displays when the user does not want to or cannot see the pictures present in a web page. Using alt tags containing keywords can improve the search engine ranking of the page for those keywords.

Click Popularity: A measure of the relevance of sites obtained by noting which sites are clicked on most and how much time users spend in each site.

Directory: A site containing links to other sites, organized into various categories. Examples of directories are Yahoo! and Open Directory.

Doorway Page: A page which has been specially created in order to get a high ranking in the search engines. Also called gateway page, bridge page, entry page etc.

Dynamic Content: Information in web pages which changes automatically, based on database or user information. Search engines will index dynamic content in the same way as static content unless the URL includes a question mark (?). If the URL does include one, many search engines will ignore the URL.

Frames: An HTML technique allowing web site designers to display two or more pages in the same browser window. Many search engines do not index framed web pages properly; they only index the text present in the NOFRAMES tag. Unless a web page which uses frames contains relevant content in the NOFRAMES tag, it is unlikely to get a high ranking in those search engines.

Hallway Page: A page containing links to various doorway pages.

Heading Tags: A paragraph style that is displayed in a large, bold typeface. Having text containing keywords in the heading tags can improve the search engine ranking of a page for those keywords.

Hidden Text: Text that is visible to the search engines but is invisible to humans. It is mainly accomplished by using text in the same color as the background color of the page. It is primarily used for the purpose of including extra keywords in the page without distorting the aesthetics of the page. Most search engines penalize web sites which use such hidden text.

Image Map: An image containing one or more invisible regions which are linked to other pages. If the image map is defined as a separate file, the search engines may not be able to index the pages to which that image map links. The workaround is to have text hyperlinks to those pages in addition to the links from the image map. However, image maps defined within the same web page will generally not prevent search engines from indexing the other pages.

Inktomi: A database of sites used by many of the larger search engines like HotBot, MSN etc. For more information, see http://www.inktomi.com

JavaScript: A scripting language commonly used in web pages. Most search engines are unable to index these scripts properly.

Keyword: A word or phrase that you type in when you are searching for information in the search engines.

Keyword Weight: Denotes the number of times a keyword appears in a page as a percentage of all the other words in the page. In general, the higher the weight of a particular keyword in a page, the higher the search engine ranking of the page for that keyword will be. However, repeating a keyword too often in order to increase its weight can cause the page to be penalized by the search engines.

Link Popularity: The number of sites which link to a particular site. Many search engines use link popularity as a factor in determining the search engine ranking of a web site.

Meta Description Tag: The tag present in the header of a web page which is used to provide a short description of the contents of the page. Some search engines will display the text present in the Meta Description Tag when the page appears in the results of a search. Including keywords in the Meta Description Tag can improve the search engine ranking of a page for those keywords. However, some search engines ignore the Meta Description Tag.

Meta Keywords Tag: The tag present in the header of a web page which is used to provide alternative words for the words used in the body of the page. The Meta Keywords Tag is becoming less and less important in influencing the search engine ranking of a page. Some search engines ignore the Meta Keywords Tag.

Pay Per Click Search Engine: A search engine in which the ranking of your site is determined by the amount you are paying for each click from that search engine to your site. Examples of pay per click search engines are Overture, HootingOwl etc.

Robot: In the context of search engine ranking, it implies the same thing as Spider. In a different context, it is also used to indicate software which visits web sites and collects email addresses to be used for sending unsolicited bulk email.

Robots.txt: A text file present in the root directory of a site which is used to control which pages are indexed by a robot. Only robots which comply with the Robots Exclusion Standard will follow the instructions contained in this file.

Search Engine: Software that searches for information and returns sites which provide that information. Examples of search engines are AltaVista, Google, HotBot etc.

Search Engine Placement: The practice of trying to ensure that a web site obtains a high rank in the search engines. Also called search engine positioning, search engine optimization etc.

Spamdexing: See Spamming.

Spamming: Using any search engine ranking technique which causes a degradation in the quality of the results produced by the search engines. Examples of spamming include excessive repetition of a keyword in a page, optimizing a page for a keyword which is unrelated to the contents of the site, using invisible text, etc. Most search engines will penalize a page which uses spamming. Also called spamdexing. In a different context, spamming is also used to mean the practice of sending unsolicited bulk email.

Spider: Software that visits web sites and indexes the pages present in those sites. Search engines use spiders to build up their databases. Example: The spider for AltaVista is called Scooter.

Stop Word: A word that often appears in pages, yet has no significance by itself. Most search engines ignore stop words while searching. Examples of stop words are: and, the, of etc.

Title Tag: The contents of the Title tag are generally displayed by the browser at the top of the browser window. The search engines use the Title tag to provide a link to the sites which match the query made by the user. Having keywords in the Title tag of a page can significantly increase the search engine ranking of the page for those keywords.
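
The Keyword Weight entry above can be made concrete with a small calculation. The Java sketch below takes one possible reading of the definition: occurrences of the keyword divided by the total number of words on the page, times 100. The class name, the method name and the simple tokenization rule are assumptions for illustration, not how any particular search engine computes it.

```java
import java.util.Locale;

final class KeywordWeight {
    // Returns the keyword's occurrences as a percentage of all words in the text.
    static double percent(String pageText, String keyword) {
        String target = keyword.toLowerCase(Locale.ROOT);
        int total = 0;
        int hits = 0;
        // Split on anything that is not a letter or digit (a simplification).
        for (String word : pageText.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (word.isEmpty()) {
                continue; // split can yield an empty leading token
            }
            total++;
            if (word.equals(target)) {
                hits++;
            }
        }
        return total == 0 ? 0.0 : 100.0 * hits / total;
    }

    public static void main(String[] args) {
        String page = "SEO tips: good SEO needs good content and more SEO";
        // "seo" is 3 of the 10 words, so this prints 30.0
        System.out.println(percent(page, "seo"));
    }
}
```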

Filed under Technical

LINQ Basics

LINQ-enabled languages provide:

  • full type safety
  • compile-time checking of query expressions
  • full IntelliSense, debugging, and rich refactoring support in development tools when writing LINQ code
  • full support for transactions, views, and stored procedures
  • integration of data validation and business logic rules into your data model

Filed under Technical