You are on page 1of 12

CAP 214: Fundamentals of Web

Programming

Homework: 1

Submitted by: -

Name: Aparnesh Roy

Course: BCA-MCA (Dual Degree)

Section: D3901

Roll No.: RD3901A32

Group: 1

Submitted to: - Lect. Rajeev Kanday


Part: - “A”
1. What are the various components of
a search engine? What steps need to be
followed to get good search results?
Ans.: There are mainly four components to a search
engine which will affect the results of your search:
1) The WebCrawler
This is the part of the search engine which combs
through the pages on the internet and gathers the
information for the search engine. Variable features
which can affect your search results include:
− Included pages
Most search engines will find information by
beginning at one page and then following all of
the links on that page. It will then follow all of
the links on these new pages, and so on.
Therefore, if a page is not linked to from
another page, it may never be found by a
search engine.
− Excluded pages
Many web pages are also excluded because
their content is dynamically generated from a
database and a search engine cannot find it.
− Documents types
Different search engines will search different
document types. All will search HTML
documents, but some will also search PDF,
PowerPoint, Word, Excel, and more.
− Frequency of crawling
An important part of a web crawler is how
frequently it retrieves information from pages.
Some sites it will visit more often than others.

2) The Database
The search engine’s database is what you are
actually searching. All of the information that a
WebCrawler retrieves is stored in a database. Every
time you use a search engine, it is this database
you are searching, not the live internet. Variable
features which can affect your search results
include:
− Size of the database
Some search engines will have extremely large
databases (~5 billion pages) for you to search,
while others will have comparatively small ones
(~150 million pages). A small database is not
necessarily worse, however, since it may offer
more focused and higher quality results than a
larger database.
− Freshness of the database
The freshness of the database is a direct result
of how frequently the web crawler retrieves
new information. If the information in the
database is fairly old, then your search results
will suffer.
3) The Search algorithm
Each search engine interprets the terms you enter
into the search box in different ways. Variable
features which can affect your search results
include:

− Operators
Most search engines allow you to use operators
such as AND, OR, and NOT in order to create
complex search statements. The terms may
need to be entered in upper case. Most also use
the plus sign to signal that a term must be
included in the search results and a minus sign
to signal that a term must be excluded from the
results.
− Phrase Searching
Search engines will generally search for words
as phrases when quotation marks are placed
around the phrase. Some search engines may
use a drop down menu which offers the option
of searching the terms as a phrase.
− Truncation
Some search engines will automatically
truncate the terms you enter. This means that
the search engine will not only search for the
term exactly as you spelled it, but will also
search on that term with alternate endings and
as a plural. Some search engines will only
search for variable endings on certain common
words.
4) The Ranking algorithm
Most searches will retrieve thousands of results.
Since you probably will only look through the first 1-
2 pages of results, you need the most relevant
results to appear first. Variable features which can
affect your search results include:
− Location and Frequency
All search engines look at the location and
frequency of words in a page. If a term appears
near the top of a web page, such as in the title
or in the first few paragraphs of text, it is
assumed that the page is more relevant than if
the term is used at the bottom of the page.
Pages where the words appear more frequently
in relation to the other words on the page also
qualifies the page as being more relevant than
other web pages.
− Link Analysis
This feature analyses how pages link to each
other and then uses this information to
determine the “importance” of each page. If a
page is linked to from a large number of other
pages, then it are ranked more highly.
− Click through measurement
Some search engines also use click through
analysis. This means that a search engine
might watch what results someone selects from
a particular search, then eventually drop high-
ranking pages that aren't attracting clicks,
while promoting lower-ranking pages that do
pull in visitors.

Following the 10 steps will also ensure good results if


your search is multifaceted and you want to get the
most relevant results.
The 10 steps are as follows:
1. Identify the important concepts of your search.
2. Choose the keywords that describe these concepts.
3. Determine whether there are synonyms, related
terms, or other variations of the keywords that should
be included.
4. Determine which search features may apply,
including truncation, proximity operators, Boolean
operators, and so forth.
5. Choose a search engine.
6. Read the search instructions on the search engine's
home page. Look for sections entitled "Help," "Advanced
Search," "Frequently Asked Questions," and so forth.
7. Create a search expression, using syntax, which is
appropriate for the search engine.
8. Evaluate the results. How many hits were returned?
Were the results relevant to your query?
9. Modify your search if needed. Go back to steps 2-4
and revise your query accordingly.
10. Try the same search in a different search engine,
following steps 5-9 above.

2. What is the purpose of using FTP?


Differentiate between FTP and TELNET?
Ans.: FTP (File Transfer Protocol): File Transfer
Protocol (FTP) is a standard network protocol used to
copy a file from one host to another over a TCP/IP-
based network, such as the Internet. FTP is built on
client-server architecture and utilizes separate control
and data connections between the client and server
applications, which solve the problem of different end
host configurations (i.e., Operating Systems, file
names). FTP is used with user-based password
authentication or with anonymous user access.
Difference between FTP and TELNET is as
follows:
1) FTP is a protocol used specifically for transferring
files to a remote location, while Telnet allows a user to
issue commands remotely.
2) FTP can be used with a command line, a dedicated
application, and even with most web browsers, while
Telnet is restricted to the command line.
3) There are ways to use FTP in a secure environment,
while Telnet will always be unsecured.
4) FTP is a well-known and reliable method of
uploading files to web servers, while Telnet is now
commonly used in diagnosing network services.

3. Compare and contrast the various


facilities provided by the WWW?
Ans.: The facilities provided by the WWW (World
Wide Web) are as follows:
1)On-line communication
2)Software sharing
3)Exchange of views on topics of common interest
4)Posting of information of general interest
5)Feedback about products
6)Customer support service
7)On-line journals and magazines
8)On-line shopping
9)World-wide video conferencing

Part: - “B”
4. Why do we use MIME to extend the
format of email?
Ans.: MIME, also known as, Multipurpose Internet
Mail Extensions, helps the e-mail protocols to permit
the users to transfer any kind of image, sound, and
program as attachment in email across the World
Wide Web. It is because of MIME only that you can
send your audios, videos and software programs to
other users through email attachments as it is the
format of non-text email attachments.
MIME was introduced to extend the current SMTP, or
Simple Mail Transport Protocol, in order to send data
other than ASCII characters through various web
clients and web servers. It, now, provides the
following extensions to the current email:
1) Non-text attachments such as images, videos,
audios and other multi-media messages.
2) Ability to send multiple objects within a single
message.
3) Character sets other than US-ASCII.
4) Writing header information in non-ASCII character
sets.
5) Text with unlimited length.

5. . Why do browsers use plug-ins?


Differentiate between cookies and plug-
ins?
Ans.: In computing, a plug-in is a set of software
components that adds specific capabilities to a larger
software application. If supported, plug-ins enables
customizing the functionality of an application. For
example, plug-ins is commonly used in web browsers
to play video, scan for viruses, and display new file
types.
Applications support plug-ins for many reasons.
Some of the main reasons include:
* To enable third-party developers to create
capabilities which extend an application?
* To support easily adding new features
* To reduce the size of an application
* To separate source code from an application
because of incompatible software licenses.
Specific examples of applications and why they
use plug-ins:
* Email clients use plug-ins to decrypt and encrypt
email (Pretty Good Privacy)
* Graphics software use plug-ins to support file
formats and process images (Adobe Photoshop)
* Media players use plug-ins to support file formats
and apply filters (foobar2000, G Streamer,
Quintessential, VST, Win amp, and XMMS)
* Microsoft Office uses plug-ins (better known as
add-ins) to extend the capabilities of its application by
adding custom commands and specialized features
* Packet sniffers use plug-ins to decode packet
formats (Omni Peek)
* Remote sensing applications use plug-ins to
process data from different sensor types (Opticks)
* Software development environments use plug-ins
to support programming languages (Eclipse, jEdit,
MonoDevelop)
* Venue, a digital mixing console architecture
developed by Digidesign and owned by Avid
Technology, allows third party plug-ins
* Web browsers use plug-ins to play video and
presentation formats (Flash, QuickTime, Microsoft
Silverlight, 3DMLW)

Cookie:
A cookie, also known as a web cookie, browser cookie,
and HTTP cookie, is a piece of text stored by a user's
web browser. A cookie can be used for authentication,
storing site preferences, shopping cart contents, the
identifier for a server-based session, or anything else
that can be accomplished through storing text data.
A cookie consists of one or more name-value pairs
containing bits of information, which may be encrypted
for information privacy and data security purposes. The
cookie is sent as an HTTP header by a web server to a
web browser and then sent back unchanged by the
browser each time it accesses that server.
Cookies, as with a cache, can be cleared to restore file
storage space. If not manually deleted by the user,
cookies usually have an expiration date associated with
them (established by the server that set it). Once that
date has passed, the cookies stored by the client will
automatically be deleted.
As text, cookies are not executable. Because they are
not executed, they cannot replicate themselves and are
not viruses. However, due to the browser mechanism to
set and read cookies, they can be used as spyware. Anti-
spyware products may warn users about some cookies
because cookies can be used to track people—a privacy
concern, later causing possible malware.
6. Is it possible to post messages to the
USENET via email? Explain?
Ans.: Usenet lets us post news messages without
any central administrative server. The messaging and
collaboration system it uses can be described as a
clash between email and Web forums. Because the
system functions primarily through messaging, we
have the freedom of posting to Usenet through email
just as we would send an email to someone
personally.
Instructions to do this are as follows:
1)Open your favourite email application and log in to
your email account.
2)Start a new email addressed to one of the following
recipients: mail2news@anon.lcs.mit.edu, usenet-
gateway@mixmaster.shinn.net and
mail2news@dizum.com.
3)Write in the subject what you want to use as the
news topic.
4)Continue writing the email like a normal one and
send it. Your message will appear on Usenet as long
as there were no delivery problems.
: FINISH :

You might also like