November 22, 2007
Last Blog Entry
When I first started this class, I only knew a handful of tricks for searching information on the internet. Most of my searches were done by using the Yahoo Web search engine. My queries were generally simple and broad. I wasted valuable time weeding through the search results, typically looking past three pages of results. If I wanted information on a certain topic, I would enter the topic name and add the word “wikipedia” after it. If lucky, there would be a Wikipedia article on the topic and I would get the information that way. I also extensively used quotation marks. For example, if I needed to find lyrics for a certain song, I would put a section of the lyrics in quotes and hope that the artist I was looking for would show up.
As it turned out, I learned a great deal about using specific queries with full-text search engines. I rarely used Google before this class, but now I use it much more often. Learning the search syntaxes drastically increased the number of relevant results on the first page of search results. The various principles presented in Web Search Garage, such as the Principle of Onions and the Principle of Nicknames, were valuable. Perhaps the most informative chapter of Web Search Garage was the first, which covered the different types of syntax that can be used with Yahoo Web and Google.
One major topic that I did not understand too well going into the class was RSS feeds. These turned out to be very helpful in searching and trapping information. In the long run, they saved me time: instead of typing in the URLs of websites that I visited daily, I had all the information aggregated in one place. I found Bloglines to be particularly useful for both keeping track of RSS feeds and searching for blogs. I felt that Bloglines could have been emphasized more. It was a very clean website for managing RSS feeds, as opposed to Technorati, which I found to be a sub-par service based on my own experience. Also, I had never even thought about creating my own RSS feeds, so Feed43 and Dapper came as an innovative shock to me.
Even though I did not use page monitors or e-mail alerts too often, they are still useful in trapping information. I had no idea what either one was before taking BIT 330. In the end, I just found RSS feeds to be the most convenient. I used RSS feeds far more than page monitors and e-mail alerts. However, if I had to use a page monitor or an e-mail alert service, I would choose WatchThatPage or GoogleAlert (third party), respectively. These are both services that I would not have known about had I not taken this class.
I will definitely use the various syntaxes that I learned in this class. More specifically, I will certainly remember chapters 1 and 12 from Web Search Garage. These chapters showed me how to use the search query syntaxes. The ones that I used most often were site:, inurl:, and intitle:. I will also continue to use Bloglines as my RSS feed aggregator for indexing topics that interest me as well as the websites that I visit daily. Lastly, I am addicted to flickr. I am a loyal Yahoo fan, but I was unaware that they owned flickr. This website turned out to be a very useful tool for searching images with tags. I used flickr to find pictures for my marketing group presentation and it proved to be just as helpful as, if not more helpful than, Google and Yahoo’s image searches. Furthermore, it is almost as addictive as Facebook (almost!).
Overall, I felt that BIT 330 was very helpful in teaching students the basics of effectively using the internet for searching and trapping information. I believe that it can be improved in a few ways. Requiring students to do only one of the two wikis for their term project might make gathering information on their topics less hectic. However, this might make the class too easy, so I would also recommend making the search tool comparison report more intensive and challenging. This can be achieved by having students search more than one query and by including more web search and blog search engines. Also, I found that there was a decent amount of overlap between Tara’s two books. I thought Web Search Garage was generally more useful and easier to read, but the RSS aspect of Information Trapping made it a fundamental part of the class as well. Because these books are outdated by a few years, maybe some new materials for the class wouldn’t be such a bad idea.
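To make the site:, inurl:, and intitle: syntaxes mentioned above a little more concrete, here is a minimal sketch in Python of how such a query string could be put together. The function name build_query and the sample terms are my own and purely illustrative; the operators themselves are just typed into the search box, so this only shows the shape of the final query.

```python
def build_query(terms, site=None, inurl=None, intitle=None):
    """Compose a search query string from plain terms and optional operators.

    Hypothetical helper for illustration only; a search engine simply
    receives the resulting string.
    """
    parts = []
    for term in terms:
        # Quote multi-word terms so they are searched as an exact phrase.
        parts.append(f'"{term}"' if " " in term else term)
    if site:
        parts.append(f"site:{site}")        # restrict results to one domain
    if inurl:
        parts.append(f"inurl:{inurl}")      # word must appear in the URL
    if intitle:
        parts.append(f"intitle:{intitle}")  # word must appear in the page title
    return " ".join(parts)


print(build_query(["bmw m3", "review"], site="flickr.com", intitle="2008"))
```

The result is one line of text that can be pasted into a search box. Keeping the operators optional makes it easy to start broad and then narrow the query one restriction at a time.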
In the end, I really enjoyed the course. I will be able to take away the tools and skills that I picked up in this class and apply them to my everyday life such as in other classes or even in my future jobs and career. I will be sure to recommend BIT 330 to aspiring junior and senior business school information trappers.
Thank you Professor Moore for hosting BIT 330. It has been a blast!
Posted by mjiao at 01:58 AM | Comments (0)
October 31, 2007
RSS Feeds
At the end of the Page Monitor Lab lecture, I learned how to use WatchThatPage, Dapper, and Feed43. I used Dapper to set up two useful RSS feeds for my business term project. Dapper is a user-friendly service that can create RSS feeds for websites that don’t normally have RSS. It can output its results in XML, HTML, and a variety of other formats.
I wanted information on Apple’s financial news, so I entered “apple finance” in the search query box. The first result that came up was “News for Apple Inc. - Google Finance.” After clicking on this search result, I was able to select the RSS feed format. After getting a short URL at the bottom of the page, I entered this into my Bloglines account.
From this feed, I learned that Apple has limited sales to two iPhones per customer and banned cash sales. Apple is trying to keep buyers from unlocking and reselling the phones for a profit. There were also many search results on Apple selling two million copies of its new operating system called Leopard. Shortly after those results, a search result said that Apple’s shares reached a record high after the announcement of the sales.
This feed has been very successful so far in returning financial-related news for Apple. It covers general information on Apple’s news on the web from Google Finance. The only criticism I have of this feed is that many results were duplicated. For example, “Apple says sells two million copies of latest Mac OS” came up at least 10 times. Also, “Apple limits iPhone buys to 2 per person, bans cash sales” came up at least 10 times too.
I also wanted general information on Apple’s news, so I entered “apple news” in the search query box. I saw two results that I liked, one being “Apple Hot News” and the other being “Apple Hot News Version 2.” Judging by the appearance of the two results in the preview box on the left, I chose the second version; it was more aesthetically pleasing. Like with the first feed that I generated, after clicking on this search result, I came up with a short URL and entered it into my Bloglines account.
From this feed, I learned that Apple announced the release date for the new Mac OS on October 16th. Also, on October 22nd, Apple reported fourth quarter results which showed growth in gross margin, sales, etc.; in short, they had a very good year. Many results were reviews of the new Leopard operating system. For example, one talked about how Leopard is the most polished and easiest operating system to use. Another one stated that using Leopard makes using a Mac more productive and fun.
This feed has also been successful so far in returning general news from Apple. It covers updated information on the Hot News page of Apple’s official website. Similar to the other feed I made, this feed also had some duplicate results, but not as many. Furthermore, there were a couple of duplicate results between the two separate feeds. One of them was “Two Million Copies of Leopard Sold in First Weekend.”
Dapper and Bloglines make a great duo for making/finding RSS feeds and keeping track of them. RSS feeds are by far the most useful resource that I have come across in this class. The material was only lightly covered in BIT 200, but I’m glad that this course went in-depth into how they work and how we can use them. RSS feeds are a valuable asset for gathering information, whether for a personal topic I’m interested in or for a company assignment that I need to research in the future.
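The duplicate results described above could in principle be filtered out before reading. The following is only a rough sketch in Python, not something Dapper or Bloglines offers as far as I know: it parses a small made-up RSS document with the standard library and keeps the first item for each title, ignoring differences in capitalization and spacing. The sample feed, its second headline, and the example.com links are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample feed; the links are placeholders.
SAMPLE_RSS = """<rss version="2.0"><channel>
<title>Sample feed</title>
<item><title>Two Million Copies of Leopard Sold in First Weekend</title><link>http://example.com/a</link></item>
<item><title>two million copies of Leopard sold in first weekend </title><link>http://example.com/b</link></item>
<item><title>Sample headline about quarterly results</title><link>http://example.com/c</link></item>
</channel></rss>"""


def unique_items(rss_text):
    """Return (title, link) pairs, keeping the first item for each normalized title."""
    root = ET.fromstring(rss_text)
    seen = set()
    result = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        # Normalize case and whitespace so near-identical titles compare equal.
        key = " ".join(title.lower().split())
        if key in seen:
            continue
        seen.add(key)
        result.append((title, link))
    return result


for title, link in unique_items(SAMPLE_RSS):
    print(title, "->", link)
```

Matching on the title alone is a deliberate simplification. The same story from two different sources often has different links, so comparing links would miss exactly the kind of repeats I saw in these feeds.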
Posted by mjiao at 11:48 AM | Comments (0)
October 28, 2007
E-Mail Alerts
At the end of the Web Search Principles 3 lecture, I set up two e-mail alerts: one with Google Alerts and the other with GoogleAlert. While Google Alerts is a free service officially provided by Google, GoogleAlert is a third-party service that also provides a good (non-free) way of monitoring the web; GoogleAlert is not affiliated with Google.
In Google Alerts, I tracked the search query “2008 new bmw m3” comprehensively and received an e-mail alert every day. The format of the e-mail looked like Google results. Results were divided into Google News Alert, Google Blogs Alert, and Google Web Alert. Sometimes, there were no results in some of the sections. For example, on October 5th, I only received Google Blogs Alert results and none from the other two categories.
Google Alerts returned some results that were not what I had wanted. Some results in the Google Blogs Alert section were completely irrelevant. For example, on October 13th, the Blogs section had a link to a blog by luxuerds98961 who put a bunch of BMW-related words together to draw attention. After clicking the link, it led to a login page on Blogger. I did not sign in because it seemed shady.
Most of the relevant results were found in the Google Web Alert section. I found some of the most interesting links here. This was where I found an exciting video of a drag race between a Bugatti Veyron and the new BMW M3 coupe. Of course, the million dollar Veyron spanked the M3, but it was still intriguing nonetheless. This was where I also found out that BMW was making a sedan version of their new M3. It will have four doors instead of two, making the M3 a little more practical (for people with families).
Although some of the results were irrelevant to my search query, they were still relevant to my personal business topic. I found information on the upcoming BMW 135i. It’s a small car that shares the same engine as the 335i which generates over 300 horsepower. I also found some information on a BMW concept called the tii. Apparently, it’s supposed to be the M version of the 1-series; however, they can’t call it the M1 because BMW actually made a car called the M1 a long time ago.
In GoogleAlert, I tracked the search query “2008 new bmw m3.” The top 50 results were tracked and searched every day. Instead of receiving e-mails with the search results, I was able to browse the results by logging into their webpage. On the left-hand side of this page, I could do things such as select which date to look at and change the view from normal to frames. With frames, I was able to visit web sites without leaving GoogleAlert. There was also a summary of results on the left side under the viewing options.
Some of the results that I found in GoogleAlert were also found in Google Alerts. For example, an article about the Dutch police getting a new BMW M3 coupe came up in both search results.
I was impressed, to say the least, by the incredible relevancy of the results from GoogleAlert. Surprisingly, they were more relevant than the Google Alerts results. Almost all of the search results pertained to the new BMW M3. For example, I found a link about the new M3 being optioned with a better gearbox from GoogleAlert that I did not see in my Google Alerts results.
If I had to choose between Google Alerts and GoogleAlert, I would go with the latter. It provides a friendly interface and the search results were, in my opinion, much better than Google’s. Additionally, GoogleAlert had no advertisements whereas Google Alerts had sponsored links in almost all of the e-mails they sent. The only downside to GoogleAlert is that in order to receive more than 50 results, create more than 3 searches, and access advanced features, you must pay a fee. Fees range from $4.95 to $39.95 each month for the different types of account services.
Overall, I found that receiving e-mail alerts every day is a bit of a hassle. They take up a lot of room in my inbox and they all have the same subject line: “Google Alerts.” Sometimes, I would not check my Gmail account for a few days and it would be piled with e-mail alerts. I prefer to receive my information on a weekly basis, or search for information that I want by myself.
Posted by mjiao at 04:35 PM | Comments (0)
October 24, 2007
Page Monitoring Tools
Today in class, we learned how to use WatchThatPage, Dapper, and Feed43. We also went through a very short tutorial on how to use Yahoo Pipes. I found the majority of these tools to be helpful in gathering information.
WatchThatPage is an ancient e-mail notification website. I say ancient because it looks like an older website with clickable-text buttons and plain colors. According to the news on the right side of the page, they don’t update often anymore, probably because of the popularity of RSS feeds and other convenient ways of receiving information. I simply signed up, edited my profile to send me updates at 8:00 AM every day, and started adding URLs. It took me a while to find informative pages on my topic that didn’t provide RSS feeds. This was probably because most of the newer pages with updated information offered an RSS service.
Dapper is user-friendly and uncomplicated. It is used to create RSS feeds for websites that don’t provide an RSS service. It can also format in XML, HTML, and a variety of other languages. I made a feed for Google Finance News and defined Apple’s stock as the input. I also searched for content on Dapper and got two pre-made RSS feeds; one was for a different Google Finance News and the other was for the official Apple News page. Getting a URL at the bottom of the page was the last step. From there, importing these simple URLs into Bloglines was a breeze.
Feed43 seemed to be complicated. Users need to understand what {%} and {*} mean and are also expected to have a basic understanding of the HTML language. Other than that, it is a very useful tool because you can customize your RSS feeds to display exactly what information you want from a website (as opposed to Dapper retrieving the whole website). The process was simple but not clear-cut. Reading the source code in step 1, defining the extraction rules in step 2, and defining the format in step 3 could be confusing for people who do not have a lot of experience with XML and HTML. Feed43 alleviated this issue by putting [?]’s at each of the steps. Clicking on a [?] will open a popup, explaining the step in further detail. Being able to manipulate the size of the text boxes with (-) and (+) was also very helpful.
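As I understand the two markers mentioned above, {%} stands for a piece of the page that should be extracted and {*} stands for a piece that should be skipped. The sketch below is my own rough approximation in Python of how such an extraction rule behaves, not Feed43's actual implementation: it turns a pattern into a regular expression and applies it to a made-up HTML snippet. The function name, the snippet, and the rule are all hypothetical, and the real service likely treats whitespace more leniently than this does.

```python
import re


def pattern_to_regex(pattern):
    """Translate a Feed43-style pattern into a compiled regular expression.

    Approximation for illustration: {%} becomes a capturing group and
    {*} becomes a non-capturing skip; everything else is matched literally.
    """
    pieces = re.split(r"(\{%\}|\{\*\})", pattern)
    regex = ""
    for piece in pieces:
        if piece == "{%}":
            regex += "(.*?)"           # extract this part
        elif piece == "{*}":
            regex += ".*?"             # skip over this part
        else:
            regex += re.escape(piece)  # literal HTML that must match exactly
    return re.compile(regex, re.DOTALL)


# Made-up page source and extraction rule.
html = (
    '<li><a href="/news/1" class="x">Leopard ships</a></li>'
    '<li><a href="/news/2" class="y">iPhone limits</a></li>'
)
rule = pattern_to_regex('<a href="{%}"{*}>{%}</a>')
print(rule.findall(html))
# [('/news/1', 'Leopard ships'), ('/news/2', 'iPhone limits')]
```

Each match yields the extracted pieces in order, here the link and then the headline. That ordering is what lets a later formatting step decide which piece becomes the item title and which becomes the item link.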
Yahoo Pipes is an interactive RSS editor with a graphical user interface. Andrew Muench made a video tutorial for it with his Mac and put it on YouTube. At first glance, Yahoo Pipes reminded me of Yahoo PageBuilder for building web pages on Geocities. In actuality, it’s an interactive RSS feed builder, meaning that you can create sources, user inputs, operators, etc. to put together a congregation of feeds from different URLs into a single RSS feed. This could be very helpful if you are searching for lots of information on a topic that’s updated often. For example, there are many RSS feeds about the Apple iPhone and someone may want to put all of those feeds together instead of adding each one independently in Bloglines.
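The idea of combining several feeds into one, as described above, can be sketched in a few lines of Python. This is not how Yahoo Pipes itself works internally (it is a graphical tool); it is just a hedged illustration using the standard library. The two sample feeds, their titles, and the example.com links are made up, and the sketch assumes every item carries a pubDate in the usual RSS date format.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Made-up sample feeds; every item is assumed to have a pubDate.
FEED_A = """<rss version="2.0"><channel><title>Feed A</title>
<item><title>iPhone story one</title><link>http://example.com/a1</link>
<pubDate>Tue, 23 Oct 2007 09:00:00 +0000</pubDate></item>
<item><title>iPhone story three</title><link>http://example.com/a3</link>
<pubDate>Thu, 25 Oct 2007 09:00:00 +0000</pubDate></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Feed B</title>
<item><title>iPhone story two</title><link>http://example.com/b2</link>
<pubDate>Wed, 24 Oct 2007 09:00:00 +0000</pubDate></item>
</channel></rss>"""


def merge_feeds(feed_texts):
    """Combine the items of several RSS documents into one list, newest first."""
    items = []
    for text in feed_texts:
        root = ET.fromstring(text)
        for item in root.iter("item"):
            title = (item.findtext("title") or "").strip()
            link = (item.findtext("link") or "").strip()
            # Raises an error if pubDate is missing; see the assumption above.
            published = parsedate_to_datetime(item.findtext("pubDate"))
            items.append((published, title, link))
    items.sort(key=lambda entry: entry[0], reverse=True)
    return items


for published, title, link in merge_feeds([FEED_A, FEED_B]):
    print(published.date(), title, link)
```

Sorting by publication date is what makes the merged result read like a single feed rather than two lists glued together.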
Each of these websites has its own strengths and weaknesses. WatchThatPage isn’t updated often anymore, but it still provides a service that most people use today. Feed43 is complicated, but once learned, it can be a useful RSS feed generation tool. Yahoo Pipes takes time to learn as well, but like Feed43, it can become a powerful asset once mastered. In my opinion, the only tool that was both aesthetically pleasing and easy to use was Dapper. I plan on using Dapper and becoming more familiar with Yahoo Pipes throughout the rest of the semester.
Posted by mjiao at 05:10 PM | Comments (0)
September 26, 2007
Technorati vs. Google Blog Search vs. Bloglines
Because my personal term project is about new cars, I searched for ‘“future cars” bmw’ on Technorati, Google Blog Search, and Bloglines. Technorati returned 72 results with the authority filter set to “a lot of authority,” Google Blog Search returned 1,787 results, and Bloglines returned 161 results. I was able to set Google Blog Search and Bloglines to display 50 results per page, but I could not do this with Technorati. Also, when searching with Technorati, quotation marks had no effect, so I had to input the search query in the advanced search section.
Out of all these blog searches, I found that Technorati returned the most garbage, even with authority turned on at its highest setting. For example, one of the search result titles was “adult membership site hosting.” There was a bunch of irrelevant words tied together in the search description. This result came up most likely because the word “car” was listed in the description. This was completely irrelevant to my search query. To add further shock value, it had an authority rating of 5.
Google Blog Search returned almost all relevant results. It has the useful feature of filtering results to those published in the last hour, 12 hours, day, week, or month. It shows results published anytime by default. You can also specify your own dates. The results can be sorted by relevance or date. As with Technorati, you can subscribe to the search results.
I felt that Bloglines returned the most useful sites and posts. There was a minimal number of irrelevant search results. All results dealt with new/future/concept cars and BMW cars. There were, however, repeats within the search results. For example, “Spy Shots: BMW 6-Series” was repeated 9 times, “Spy Shots: 2008 BMW 7-Series” was repeated 4 times, and “Chris Bangle Great cars are Art” was repeated 3 times. There were also a couple of duplicate search results. You could conveniently preview results and subscribe to the feeds on the spot without having to visit the sites. Search results can be sorted by relevance, date, or popularity.
Furthermore, Bloglines also returned useful posts that neither of the other two websites had in the first 50 or so results. Some examples include posts about spy shots and information about the upcoming 2009 BMW Z4, a preview of the upcoming 2010 BMW Z8, and photos of the BMW Concept CS.
There were a couple of overlaps between the three blog search websites as well. “BMW Reveals Hydrogen 7” came up on all three websites. “BMW X6 without disguise” and “Future-cars.info20.com - BMW H2R On Discovery Channel” came up only on Google Blog Search and Bloglines. From this, it can be deduced that all three of these blog search websites will generally come up with some of the same results in most instances. However, it’s the amount of garbage that you get in the results that makes the difference between a poor blog search engine and a dependable blog search engine.
Posted by mjiao at 03:19 PM | Comments (0)
September 19, 2007
Giving Technorati a Second Shot
Last night when trying to access Technorati at approximately 7:30 PM, the web site failed to load. Today, it seems to be up and running again. Perhaps they were fixing bugs or updating at the time.
From a first glance, Technorati seems to have a green-themed layout which is pleasing to the eye. There is usually an advertisement banner across the top of the page. They list new blog posts and categorize them by entertainment, technology, politics, sports, business, and life; this is updated several times each minute. They also list the most popular search terms and tags on the right-hand side of the page. Under that is a section for featured bloggers.
Aside from the satisfying layout, the search function is not 100% effective. For example, when I searched “college humor” hoping to find some blogs on the popular college-based web site, my first search result was “FREE SHOCKING PORN MOVIE” (5 minutes ago in uzdhhutp). I was pretty disappointed at this point. Furthermore, to add insult to injury, so to speak, advertisements littered the top and right-hand side of the page.
I decided to give it another try by searching “new bmw m3,” as I am an avid car fan and have been following the new BMW M3 for quite some time. This time, search results were much more relevant. All the videos and pictures returned were exceptional but not all post or blog results had to do with the search query. Also, some search queries took more than 5 seconds to return results.
The advanced search is extensive and lets you specify a query that contains all of the words, the exact phrase, at least one of the words, or none of the words. You have the option of searching all blogs, blogs about a certain topic, or a specific blog URL. Additionally, you can search tags and their blog directory.
There are three tabs under the Technorati banner: Home, Popular, and Topics. I found the Popular tab to be useful; it leads you to the top favorited blogs, top searches, and top blogs. This is a good way to stay up-to-date in the blogging realm. For the time being, I think that the Home and Topics tabs deliver the same page. They both lead to new blog posts which are updated several times each minute.
If you sign up with Technorati, you can claim your own blogs and create your own profile on their web site. Membership looks to be free and relatively straightforward.
Overall, I do not believe that I would enjoy using Technorati on a daily basis. Advertisements, false/offensive search results, and registration deter me from this web site. If you enjoy an efficient advanced search with updates from the newest blogs every minute, then Technorati might be for you.
Posted by mjiao at 12:28 PM | Comments (0)
September 18, 2007
Searchable Feed Databases
Blogdigger’s layout looks similar to Altavista’s homepage. Search results are shown in a Google-like format. Each search result has a title, brief section of the page, and a place that says Feed, Focus, and Exclude. Feed gives you the RSS feed for the page, Focus gives you all posts from the feed, and Exclude gets rid of all the posts from this source from the current search results. It allows you to search by date or relevance. You can also subscribe to the whole search on the right-hand side of the screen via My Yahoo, Google, myAOL, Bloglines, or NewsGator.
Google Blog Search looks very similar to Google’s normal search engine. You can either search blogs or search the web. On the left-hand side of the screen, you can select which results you want to see in terms of when it was published and you can also subscribe to blogs via RSS or Atom. Search results include the title, brief section of the page, and References which doesn’t seem to do much at the moment. It is a slight hassle to visit the page to subscribe to the RSS feed. Google Blog Search also lists related blogs to your search at the top of the results. Furthermore, you can do an Advanced Blog Search where you specify what query you want, what to exclude, authors, date written, etc.
Technorati’s web page did not seem to load at 7:30 PM on September 18, 2007. When attempting to access their webpage later at 8:00 PM, it said “Doh! The Technorati Monster escaped again.” and “We're scouring the blogosphere attempting to find it. Back in a flash!”
BlogPulse has many components on its home page. On the left-hand side, there are top blog posts, top videos, top news stories, and RSS feeds. On the right-hand side, there is a graph showing percentage of posts by topic, web site statistics, and an informational section. Down the center of the page, there is a search query box and a section with updated information on recent activity in the blog universe. Search results have a title, brief section of the page, posting date, and time discovered. There is also a place where you can view the blog profile to see the blog title, blog URL, and rank. Search results are also constantly changing and updated by the minute. You can get a feed for the search. Like Google Blog Search, it is a slight hassle to visit the page to subscribe to the RSS feed. The advanced search on BlogPulse lets you specify a specific query, or you can create a Boolean query using the operators AND, OR, and NOT. You can also select the date range and sort the results by date or relevance.
Rojo’s home page is unique. There is a category section on the left-hand side and keyboard shortcuts on the right-hand side. Also, the most popular or recent blogs seem to be listed at the center of the page. In order to use Rojo at its full potential, you would need to create an account. Each search result has its own title, summary, and buttons where you can flag, e-mail, Rojolink, or tag it. Even if you are not a registered member of Rojo, you can still subscribe to its feeds.
IceRocket’s layout is relatively simple. You can search for blogs, the web, MySpace (which I thought was pretty interesting), news, or images. There is a link for the most blogged-about movies and also competitions about who has more blog buzz. On the search results page, hot topics are listed across the top. Each search result has a title, brief section of the page, word count, and the ability to subscribe, share, focus, or exclude. Focus lets you see all the posts from the blog and exclude gets rid of all the posts from that source from the current results. There are sponsored links on the top and right-hand side of the page. The advanced search options are very similar to Google Blog Search’s advanced search. Furthermore, you can subscribe to the search results.
If I had to choose one of these blog search engines, I would choose none of them. In my opinion, they all have drawbacks that, with repeated use, would begin to irritate me. For example, Blogdigger and IceRocket’s web sites are not visually appealing. You must travel to the actual blog site to get an RSS feed with Google Blog Search and BlogPulse’s search results. Technorati’s web site did not load at all. Lastly, in order to get the most out of Rojo, you must create an account on their web site. Currently, if I wanted to search a topic and subscribe to some RSS feeds, I would use Bloglines. The layout of that web site is straightforward and relatively easy to use. So far, Bloglines is my top choice.
Posted by mjiao at 08:41 PM | Comments (0)
September 12, 2007
Seven Different Searches
When I searched for “timber industry california” on Wednesday, September 12, 2007, Google yielded about 1,980,000 results. By comparison, there were about 144 results from Yahoo Directory, 2,940,000 results from Yahoo Web, 106,066 results from Scirus, 27,500 results from Google Scholar, 68 results from UM Library’s Search Tools, and 13 results from CompletePlanet.
I found it interesting that some of the search results from Google and Yahoo Web were the same. For example, both search engines returned “Timber Industry Influence” and “Disaster plan for timber industry” on their first page of results. On an even more interesting note, Yahoo returned roughly 1,000,000 more results and did so 0.06 seconds faster; however, the relevance of each of those results is questionable. Both search engines included local results in California. Google had some search results that were completely irrelevant to the search query. For example, Timberland, which is a manufacturer of outdoor wear, was under local business results. Both pages also have a section for sponsored results where companies pay to have their advertisements shown. Personally, I found these two search engines to be somewhat aesthetically similar.
Yahoo Directory returned scarce results but each result seemed to be more relevant to the search topic. Almost all results on the first page had to deal with both timber and California. For example, “California Cascade Industries” and “California Forest Association” were returned. Also, almost all of these search results were company web sites.
In my opinion, Scirus probably returned the most relevant results for the search topic (probably because “and” is default on this search web site and they deal with mostly scientific information). Every result on the first page dealt with both the topic of timber and California. Scirus also divided and categorized the search results into journal results, preferred web results, and other web results. This can be helpful for further narrowing down sources of information. Additionally, it has a sponsored links section and recommended search refinements with keywords on the right side of the page.
Google Scholar returned books which were relevant to the search topic. The search results can be divided up by authors on the left side of the page. I scarcely use Google Scholar, but it looks to be a good tool for doing research on a specific topic when looking for books or other physical articles of information. Also, I realized that there were no sponsored links on this page.
Truthfully, I was a little disappointed with UM Library’s Search Tools. It is similar to Google Scholar in that it returns results from physical articles of information, but it yielded fewer results and it is a large hassle to actually get to the information. Additionally, not all of the results could be retrieved; only 52 out of 68 results could be retrieved when I searched “timber industry california”.
CompletePlanet returned some relevant results as well, but I do not believe that it performed better than Scirus. Only 13 results were returned and most of them were not pertinent to the search query. Also, I found it to be a little ironic that they had ads by Google on the right side of the results page.
Overall, I think that the deep search engines such as Scirus perform better if you are looking for research information or citable information such as a book. However, when searching for quick information or facts, Google and Yahoo Web did perfectly fine, although it did take a bit more digging in each search engine to get the best results.
Posted by mjiao at 12:12 PM | Comments (0)
September 09, 2007
What Do I Want To Get Out Of This Class?
Although I believe that I am an adept computer user, there is always more to learn as technology improves rapidly on a daily basis. Both my parents are senior computer programmers and I have been brought up as somewhat of a computer engineer.
I frequently use the Yahoo and Google search engines while surfing the internet. Using them more efficiently would save me valuable time and help me become an even better asset for future companies that I would be working for. Improving upon my computer skills and having a class that deals heavily with the internet will better my performance in the workforce, as I believe that informational research is critical to any business-related career.
More specifically, I want to increase my knowledge of RSS feeds and blogs. I first learned about RSS feeds in BIT 200 and I would like to learn more about them in this class. I would also like to know how to utilize an e-mail alert system. Lastly, I would like to learn how to use a wiki because I visit Wikipedia pages almost every day without knowing how they are revised and edited.
I look forward to being an active member of this class so I can accomplish all of my goals and make it a successful course.
Posted by mjiao at 08:17 PM | Comments (0)