Framework to Analyze Object Oriented Model

Abstract: The importance of modeling is evident from its use in predicting cost and time requirements, particularly in completing a system. However, there is no equivalent standard for evaluating the quality of conceptual models. Thomasson has also shown the difficulties in designing an appropriate UML class diagram, such as naming the notation elements. UML class diagrams designed by students often neglect modeling quality attributes such as consistency and accuracy. Our study proposes the use of WordNet to achieve quality in modeling. To extract synsets from WordNet, we use RiTa.WordNet as a tool. The use of RiTa.WordNet shows that the extracted synonyms can be used to match the UML class names designed by students, and this will be used to increase the accuracy of the object-oriented model.

Key-Words: Framework; Object-oriented Model; Pattern-based Extraction (PbE); RiTa.WordNet; Synsets Extraction; WordNet

1 Introduction

A framework is a basic conceptual structure that is used to solve or address complex issues, usually a set of tools, materials or components. It is also a reusable, half-complete application that is used for producing other applications [1, 2]. In a software context, the term framework is used as a name for different kinds of toolsets. Currently, frameworks are most commonly represented through design diagrams written in standard object-oriented analysis and design languages. Frameworks model a specific domain or an important aspect thereof [3]. They represent the domain as an abstract design, consisting of abstract classes (or interfaces). The abstract design is more than a set of classes, because it defines how instances of the classes are allowed to collaborate with each other at runtime. Effectively, it acts as a skeleton, or scaffolding, that determines how framework objects relate to each other.

Towards system completion, modeling is essential for predicting cost and time requirements. However, there is no equivalent standard for evaluating the quality of conceptual models. The traditional focus of software quality shows that only the final product has been evaluated. The main task in object-oriented development is concentrated on the construction of a model of a problem domain, rather than on software implementation. Improving the quality of conceptual models is as important as improving the quality of the delivered system [4]. Traditionally, system modeling can be represented by text or graphically. Nevertheless, description through this approach can solve only some of the problems related to understanding the system requirements. This scenario gives rise to misunderstanding between the user and the system engineer, and it can also make a system inconsistent [5].

Because of its significant popularity and its status as the de facto standard for modeling software architecture and design, the Unified Modeling Language (UML) was adopted as a standard by the Object Management Group (OMG) in November 1997 and now serves as the standard language for software designs [6]. In modeling, designing the UML class diagram is an important phase. Nevertheless, UML lacks formal semantics, i.e. the meaning of the elements of a UML model is not formally defined and may depend on the interpretation of the individuals who are using UML [7]. Assessing some problems in modeling, research by Thomasson shows the difficulties in designing an appropriate UML class diagram [8]. They are:

The variation of the design form.

Naming the notation elements.

Freedom in designing.

Difficulty in stating the class or object.

Difficulty in elaborating the requirements.

UML class diagrams designed by students often neglect modeling quality attributes such as consistency and accuracy. This should be overcome to make sure that there is no duplication in class naming and that the inheritance relationships are valid. For this purpose, a framework will be developed to overcome the inconsistency problem in UML class diagrams [9].

The remainder of the paper is organized as follows. Section 2 discusses the background of this research. Section 3 describes the method used in our study. Section 4 discusses the framework, and Section 5 presents the testing of results. In Section 6 we give the results of our study, followed by the discussion in Section 7. Section 8 summarizes our study and points out some future research issues.

2 Background

Many approaches have been taken to produce a good system or model. One of them is the use of tools such as CONCEIVER++, an understanding-based program debugger for object-oriented programming languages [10]. However, analysis is also an important stage, because the conceptual model can be shown to fulfill the requirements and becomes the skeleton on which a complete system is built. Without thorough analysis, it is impossible to have a good design or a correct implementation [11]. In modeling, analysis of semantic quality can help students in designing the UML class diagram. By doing semantic analysis, we delve even deeper to check whether the elements form a sensible set of instructions in the programming language. This will help students to overcome the variation in naming the notation elements, because the names are semantically valid.

2.1 WordNet Background

Owing to improvements in technology and the realization of the importance of semantics, WordNet [12] has been developed as a tool that has had a great impact on the educational world, especially in the semantic field. A few studies have been published on using WordNet for education [13] or in specialized domains [14]. There are many studies on producing WordNets for various languages, such as Malay WordNet [15], Thai WordNet [16], EuroWordNet [17] and many more. This shows that WordNet has been accepted all over the world. WordNet has been released in several versions, and every new version shows an increasing number of words. Table 1 and Table 2 give a statistical overview of WordNet from version 1.6 to 3.0.

Table 1 Amount of words and synsets in WordNet [ 18 ]

| WordNet version | # Noun Synsets | # Verb Synsets | # Noun | # Verb |
|---|---|---|---|---|
| 1.6 | 66,025 | 12,127 | 94,474 | 10,319 |
| 1.7 | 75,084 | 13,214 | 109,195 | 11,088 |
| 2.0 | 79,689 | 13,508 | 114,648 | 11,306 |

Table 2 Amount of words and synsets in WordNet (continued from Table 1)

| WordNet version | # Noun Synsets | # Verb Synsets | # Noun | # Verb |
|---|---|---|---|---|
| 2.1 | 81,426 | 13,650 | 117,097 | 11,488 |
| 3.0 | 82,115 | 13,767 | 117,798 | 11,529 |

A statistical overview of WordNet versions is important, especially for those who are interested in developing a WordNet for another language, such as Chinese WordNet, Thai WordNet, EuroWordNet and many more. There is still a lot of work to do to improve WordNet and its widespread applications. Each version of WordNet records an increased number of noun synsets, verb synsets, nouns and verbs. This shows that WordNet continues to grow, which is an advantage of an online database. In Figure 1, the increase in the number of synsets and words can be seen clearly from WordNet version 1.6 to version 3.0.
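The growth described above can be quantified directly from the counts in Table 1 and Table 2. The following is a minimal sketch; the dictionary layout and the function names are our own illustrative choices, and only the numbers come from the tables.

```python
# Counts copied from Table 1 and Table 2 (WordNet versions 1.6 to 3.0).
# The structure and names below are illustrative, not part of WordNet.
COUNTS = {
    "1.6": {"noun_synsets": 66025, "verb_synsets": 12127, "nouns": 94474, "verbs": 10319},
    "1.7": {"noun_synsets": 75084, "verb_synsets": 13214, "nouns": 109195, "verbs": 11088},
    "2.0": {"noun_synsets": 79689, "verb_synsets": 13508, "nouns": 114648, "verbs": 11306},
    "2.1": {"noun_synsets": 81426, "verb_synsets": 13650, "nouns": 117097, "verbs": 11488},
    "3.0": {"noun_synsets": 82115, "verb_synsets": 13767, "nouns": 117798, "verbs": 11529},
}


def percent_growth(old, new):
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


def growth_between(first, last):
    """Percentage growth of every count between two WordNet versions."""
    return {
        key: percent_growth(COUNTS[first][key], COUNTS[last][key])
        for key in COUNTS[first]
    }


if __name__ == "__main__":
    for key, value in growth_between("1.6", "3.0").items():
        print(f"{key}: {value:.1f}%")
```

Under these figures, noun synsets and nouns each grow by roughly a quarter between versions 1.6 and 3.0, while verb synsets and verbs grow more slowly.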

Fig 1 Increase in WordNet synsets and words

2.2 WordNet Concept

WordNet is a large lexical database of (any) language. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations. WordNet is a system for bringing together different lexical and semantic relations between words. It organizes lexical information in terms of word meanings and can be termed a lexicon based on psycholinguistic principles.

2.3 WordNet Search Results

The results of a search of the WordNet database are displayed in the Results Window. Horizontal and vertical scroll bars are present for scrolling through the search results. All searches other than the overview list all senses matching the search results in the following general format.

Items enclosed in italicized square brackets ([ … ]) may not be present.

If a search cannot be performed on some senses of searchstr, the search results are headed by a string of the form: X of Y senses of searchstr.

One line listing the number of senses matching the search selected.

Each sense matching the search is displayed as follows:

Sense n

[{synset_offset}] [<lex_filename>] word1[#sense_number][, word2…]
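The sense line format above can be read mechanically. The sketch below is one possible parser, written only from the general format given in this section; the function name, the returned dictionary keys and the sample line in the usage note are illustrative assumptions, not part of the WordNet distribution.

```python
import re

# Optional "{synset_offset}", optional "<lex_filename>", then the word list.
SENSE_LINE = re.compile(
    r"^(?:\{(?P<offset>\d+)\}\s*)?"
    r"(?:<(?P<lexfile>[^>]+)>\s*)?"
    r"(?P<words>.+)$"
)
# A word optionally followed by "#sense_number".
WORD = re.compile(r"^(?P<word>.*?)(?:#(?P<sense>\d+))?$")


def parse_sense_line(line):
    """Split one sense line into offset, lexicographer file and words."""
    match = SENSE_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a sense line: {line!r}")
    words = []
    for part in match.group("words").split(","):
        word_match = WORD.match(part.strip())
        sense = word_match.group("sense")
        words.append((word_match.group("word"), int(sense) if sense else None))
    return {
        "offset": match.group("offset"),    # None when the item is absent
        "lexfile": match.group("lexfile"),  # None when the item is absent
        "words": words,
    }
```

For instance, calling `parse_sense_line("{02084071} <noun.animal> dog#1, domestic dog#1")` on this made-up sample line yields the offset, the lexicographer file name and two (word, sense number) pairs; the optional items come back as `None` when the line omits them.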

In WordNet search results, the lexicographer file names appear when some options are selected for the extended view. The lexicographer files are processed by grind [19], which then produces a database suitable for use with the WordNet library, interface code, and other applications. The format of the lexicographer files is described in wninput [20]. A file number corresponds to each lexicographer file. File numbers are encoded in several parts of the WordNet system as an efficient way to indicate a lexicographer file name. The file lexnames [19] lists the mapping between file names and numbers, and can be used by programs or end users to correlate the two.

As the case study of this research, a review of WordNet itself is very important. In this research, WordNet will be used to extract its synsets for the object-oriented model. WordNet is well-known software used in much research. Its combination of thesaurus and dictionary makes it very useful, especially in the semantic domain. One of the prominent examples of the use of WordNet is to determine the similarity between words. Various algorithms have been proposed, and these include considering the distance between the conceptual categories of words, as well as considering the hierarchical structure of the WordNet ontology. A number of these WordNet-based word similarity algorithms are implemented in a Perl package called WordNet::Similarity [21] and in a Python package called NLTK.
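To illustrate the idea of distance-based similarity mentioned above, the following sketch scores two words by the shortest path between them in a hypernym hierarchy, using 1 / (1 + path length). The tiny hierarchy is a hypothetical stand-in for WordNet, and the code is not taken from WordNet::Similarity or NLTK; it only shows the general principle.

```python
# Hypothetical child -> parent (hypernym) links; NOT real WordNet data.
TOY_HYPERNYMS = {
    "customer": "person",
    "guest": "person",
    "person": "entity",
    "account": "record",
    "record": "entity",
}


def ancestors(word, parent):
    """Map the word and each of its ancestors to the distance from the word."""
    distances = {}
    node, depth = word, 0
    while node is not None:
        distances[node] = depth
        node = parent.get(node)
        depth += 1
    return distances


def path_similarity(a, b, parent):
    """Return 1 / (1 + shortest path length), or 0.0 if no common ancestor."""
    dist_a, dist_b = ancestors(a, parent), ancestors(b, parent)
    common = set(dist_a) & set(dist_b)
    if not common:
        return 0.0
    shortest = min(dist_a[node] + dist_b[node] for node in common)
    return 1.0 / (1 + shortest)
```

In this toy hierarchy, "customer" and "guest" share the direct parent "person" and therefore score higher than "customer" and "account", which only meet at "entity".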

2.4 WordNet Application

There are various applications of WordNet, especially in Natural Language Processing, which is said to be the most successful application area of WordNet. The results in Figure 2 come from searches run on the bibliographic databases LISA, INSPEC, IEEE and ResearchIndex and on the Universidad Carlos III's OPAC. They show the documents about WordNet published from 1994 until 2003 [22].

Fig 2 WordNet Applications

Based on Figure 2, the application of WordNet in information retrieval also ranks highly among the research areas. This shows that research on information retrieval with WordNet has received attention in previous studies. We are interested in focusing on the application of WordNet in information retrieval and extraction. Previous research shows that WordNet has been used for different purposes, such as:

As a semantic lexicon in a module for full text message retrieval in a communication aid [23].

As a linguistic knowledge tool [24].

As a tool for the automatic construction of thesauri [25, 26].

To expand queries to optimize the precision of Internet search engines in the development of a natural language interface [27].

Research using WordNet for extraction has been carried out year by year since its development. Some studies use WordNet as an ontology, some as a tool, and so on. One domain has been chosen for further reading: Table 3 lists several studies that relate WordNet and extraction, sorted by year. These studies help a lot in extracting WordNet synsets.

Table 3 Research on WordNet and Extraction

Author

Research discussion

Hearst, M.A. [28]

Describes a method for the acquisition of the hyponymy lexical relation from unrestricted text.

Discusses 2 goals for this approach:

Avoidance of the need for pre-encoded knowledge.

Applicability across a wide range of text.

Discusses the concept of hyponym.

Pedersen, T. et al. [29]

Compares 3 unsupervised learning methods that distinguish the sense of an ambiguous word in untagged text:

McQuitty's similarity analysis [30].

Ward's minimum-variance method [31].

EM algorithm [32].

The most accurate of these procedures is McQuitty's similarity analysis in combination with a high dimensional feature set.

Kaplan, A. N. [33]

This report describes the attempts to arrive at a quantitative measure of the quality of the information that can be extracted from WordNet by interpreting it as a formal taxonomy, and to design automatic techniques for improving the quality by filtering out dubious assertions.

Pearce, D. [34]

This paper describes the use of WordNet in a new technique for collocation extraction.

Some existing extraction techniques are discussed:

Choueka [35]: N-grams from 2 to 6 words in length.

Church and Hanks [36]: describe techniques that use mutual information to measure the strength of association between words.

Smadja [37]: infers syntax by measuring the spread of the distribution of counts between the two collocates.

Lin [38]: bases his extraction method on dependency triples obtained from a shallow-parsed text corpus.

Katz [39]: uses patterns of parts-of-speech to extract technical terms (closely related to collocations).

The author also produces a new definition of a two-word collocation which contrasts with the others.

Gomes, P.J.d.s [40]

The REBUILDER system is a first step towards the development of a commercial CASE tool that addresses the support of software design and design knowledge management.

El-Kahlout, I.d. et al. [41]

Presents a Meaning To Word (MTW) system for the Turkish language.

Finds a set of words closely matching the definition entered by the user.

Extracting words from “meaning” is based on checking the similarity between the user's definition and each entry of the Turkish database without considering any semantic or grammatical information.

MTW for Turkish finds the appropriate words whose definitions match the given definition.

2 problems:

Locating a number of candidate words whose definitions are “similar” to the definition in some sense.

Ranking these candidate words using a variety of ways to return a list sorted in terms of similarity.

Extracting words from meaning:

Checking the similarity between the user definition and each entry of the dictionary by making a number of analyses without considering the semantics of the contexts.

Uses NLP techniques to enhance the effectiveness of term-based information retrieval.

3 Proposed Method

In this section, we analyze existing methods that have been used for extracting synonyms from dictionary definitions. Of the methods reviewed, the most suitable for this research is Pattern-based Extraction (PbE), which can be used to extract synsets and match them with the answer scheme prepared by the lecturer.

We propose to adapt this method for extracting synsets from WordNet to overcome some problems in UML models. Some characteristics of PbE and the significance of choosing it as a method in this research are discussed below.

3.1 Concept of Pattern-based Extraction (PbE)

Because of the regularity in the composition of dictionary definition text, pattern-based extraction (PbE) has been chosen as one of the methods to apply in this study. One of the objectives of PbE is to discover the particular patterns that synonyms tend to follow in definition texts [42]. Therefore, the research in [42, 43] focuses on those definientia that follow such patterns.

3.2 Algorithm for Synonym Extraction

The algorithm for synonym extraction builds a dictionary graph in which a definiendum (the word being defined) is related only to those definientia (defining words) that follow certain pre-specified patterns.

For synonym extraction, the step-by-step algorithm is given below.

Simple Pattern-based Extraction Rule [42, 43]

resultSet ← { }
newlyExtractedSet ← { targetWord }
repeat
    for all w in newlyExtractedSet do
        for all definition of w do
            for all p in PbEPatterns do
                if definition matches p on s then
                    add s to newlyExtractedSet
                end if
            end for
        end for
        remove w from newlyExtractedSet
    end for
    add newlyExtractedSet to resultSet
until newlyExtractedSet is empty
return resultSet
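The extraction rule above can be sketched in runnable form. The version below is our own interpretation: it keeps a separate frontier of newly extracted words so that the set being iterated is never modified, and it stops when a round extracts nothing new. The toy dictionary and the two regular expressions are hypothetical and are not the patterns used in [42, 43].

```python
import re

# Hypothetical toy dictionary: word -> list of definition texts.
TOY_DICTIONARY = {
    "customer": ["same as client"],
    "client": ["a patron; also called guest"],
    "guest": ["a visitor to a hotel"],
}

# Hypothetical patterns; the named group "syn" captures the synonym s.
TOY_PATTERNS = [
    re.compile(r"\bsame as (?P<syn>\w+)"),
    re.compile(r"\balso called (?P<syn>\w+)"),
]


def extract_synonyms(target_word, dictionary, patterns):
    """Collect words reachable from target_word through pattern matches."""
    result = set()
    frontier = {target_word}
    while frontier:
        next_frontier = set()
        for word in frontier:
            for definition in dictionary.get(word, []):
                for pattern in patterns:
                    for match in pattern.finditer(definition):
                        synonym = match.group("syn")
                        if synonym != target_word and synonym not in result:
                            result.add(synonym)
                            next_frontier.add(synonym)
        frontier = next_frontier
    return result
```

With the toy data, starting from "customer" the first round extracts "client" and the second round extracts "guest", after which no definition matches and the loop ends.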

Given an initial word list W0 and an initial pattern set P0, we follow this procedure for synonym extraction for a target word t:

1. If any word w matches any pattern p ∈ Pi, extract w as a synonym of t and update the word list Wi = Wi ∪ {w}.

2. If t matches any pattern p′ ∈ Pi in the definition text of some other word w′, extract w′ as a synonym of t and update the word list Wi = Wi ∪ {w′}.

3. Take each word w ∈ Wi as the target word and repeat steps 1 and 2; add all resulting synonyms to Wi and denote the new set Wi+1.

3.3 Pattern Bootstrapping

Words in Wi+1 are assumed to appear in each other's synonym sets in patterns other than the ones started with in Pi. The regex set Pi is updated by adding these new patterns, and synonym extraction is repeated with Wi+1 and Pi+1.

3.4 Application

Based on [43], PbE has been used in several experiments for extracting synonyms from dictionary definitions. PbE has been tested on solving TOEFL synonym questions, on comparison against an existing thesaurus, and on labeling synonyms in definitions. In these three experiments, PbE showed the best performance compared with two other methods, Inverted Index Extraction (IIE) and Maximum Entropy (MaxEnt). We focus on the experiment that compares against existing thesauri. The goal of this experiment is to evaluate the degree of synonymy among the extracted words. Here, the authors focus on comparisons between IIE and PbE and on how their performance varies according to target word frequency, as shown in Figure 3.

Fig 3 Inverted index extraction versus pattern-based extraction when compared with an existing thesaurus. High, Medium, and Low refer to different frequencies of target words in the Wall Street Journal. Adapted from [43]

From this experiment, PbE has slightly better precision and drastically better recall, resulting in F scores about 3-5 times as high as those of IIE.
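For readers unfamiliar with the measures used in this comparison, the sketch below computes precision, recall and the balanced F score for a set of extracted synonyms against a reference thesaurus entry. We assume the balanced F score F = 2PR / (P + R); the source does not state which F variant was used in [43], and the word sets in the usage note are invented for illustration.

```python
def precision_recall_f(extracted, gold):
    """Return (precision, recall, F) of extracted words against gold words."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score
```

For example, if a method extracts {client, guest, node} and the reference entry is {client, guest, patron, buyer}, two of three extracted words are correct (precision 2/3) and two of four reference words are found (recall 1/2), giving F = 4/7.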

Based on the performance of PbE in previous experiments, we choose PbE as our proposed method for matching the extracted synsets against a specific pattern. We propose to use regular expressions as the patterns for matching the extracted synsets with the answer scheme prepared by the lecturer. We analyzed the step-by-step algorithm set up for PbE and adapted it to our research. The three steps of the synonym extraction algorithm can be adapted for matching synsets with the answer scheme; we only need to make some changes to the pattern for the synsets to be matched.
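As a rough illustration of matching with regular expressions, the sketch below turns each class name in an answer scheme into a pattern and tests a candidate word against it. The tolerance rules (case-insensitive, flexible separators, optional trailing "s") and all names are our own assumptions; the source does not specify the actual patterns used.

```python
import re


def answer_pattern(class_name):
    """Build a tolerant regular expression for one answer-scheme class name."""
    tokens = [t for t in re.split(r"[\s_\-]+", class_name.strip()) if t]
    if not tokens:
        raise ValueError("class name must not be empty")
    # Allow any mix of spaces, underscores or hyphens between the tokens.
    body = r"[\s_\-]*".join(re.escape(token) for token in tokens)
    return re.compile(rf"^{body}s?$", re.IGNORECASE)


def matches_scheme(candidate, answer_scheme):
    """Return the answer-scheme entries that the candidate word matches."""
    candidate = candidate.strip()
    return [name for name in answer_scheme if answer_pattern(name).match(candidate)]
```

Under these assumed rules, `matches_scheme("customers", ["Customer", "Account"])` returns `["Customer"]`, while a word such as "guest" matches nothing and would have to be resolved through synset extraction instead.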

4 Framework to Analyze Semantic of Object-Oriented Model (FASOOM)

Generally, this framework consists of five phases. The details of every phase are discussed below.

Phase 1

This is where the input is given. The inputs are the students' answers, which are matched against the answer scheme prepared by the lecturer. The answers here are UML class names. If a class name given by a student matches the answer scheme, the answer is considered correct. If not, processing continues to the second phase.

Phase 2

Phase 2 is applied when the answers given by students do not match the answer scheme. Synsets are extracted from WordNet to match both answers. In this phase, we use RiTa.WordNet, one of the core objects in the RiTa toolkit, to get all the synsets for the object-oriented model. The synsets extracted in this process are then used in the next phase for the matching process.

Phase 3

Phase 3 is the matching process, which matches the synsets extracted in the previous phase against the answer scheme used in the first phase.

Phase 4

If one of the extracted synsets matches the answer scheme, the synset is stored as a matched word (class name) and the knowledge base is updated.

Phase 5

Finally, all synsets stored in the knowledge base are the outputs for the students' answers, and the students' answers are considered correct because they match the synsets. The phases of the framework can be seen clearly in Figure 4.

Fig 4 FASOOM Framework
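The five phases can be sketched as a single function. This is a minimal illustration under stated assumptions: the synonym lookup is a plain dictionary standing in for RiTa.WordNet, the knowledge base is an in-memory dictionary, matching is a case-insensitive exact comparison, and every name in the code is hypothetical.

```python
# Hypothetical stand-in for the synsets that RiTa.WordNet would return.
TOY_SYNSETS = {
    "customer": ["client"],
    "client": ["customer", "guest", "node"],
}


def toy_lookup(word):
    """Return the synsets for a word, or an empty list when none exist."""
    return TOY_SYNSETS.get(word, [])


def grade_class_name(student_answer, answer_scheme, synonym_lookup, knowledge_base):
    """Return True when the student's class name is accepted."""
    # Phase 1: compare the student's answer with the answer scheme.
    answer = student_answer.strip().lower()
    scheme = {name.strip().lower() for name in answer_scheme}
    if answer in scheme:
        return True
    # Phase 2: extract synsets for the unmatched answer.
    synsets = synonym_lookup(answer)
    # Phase 3: match the extracted synsets against the answer scheme.
    matched = [s for s in synsets if s.lower() in scheme]
    # Phase 4: store matched words in the knowledge base.
    if matched:
        knowledge_base.setdefault(answer, set()).update(matched)
    # Phase 5: the answer is accepted when at least one synset matched.
    return bool(matched)
```

With the toy data, the answer "Client" is accepted against a scheme containing "Customer" because "customer" is among its synsets, whereas "Money" is rejected because the lookup returns no synsets.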

5 Implementation

A lot of research has been done on testing object-oriented programs. However, few studies address the problems related to integration testing [44-46]. In other research, a tool has been developed for testing object-oriented programs at the integration level, automating both test case generation and test execution [47]. A test automation framework is a set of assumptions, concepts or tools that provide support for automated software testing [48-50]. The testing framework is responsible for [51]:

defining the format in which to express expectations

creating a mechanism to hook into or drive the application under test

executing the tests

reporting results

In this research, a tool is needed to validate the FASOOM framework. The tool chosen for this research is RiTa [52]. RiTa covers a range of computational tasks related to literary practice, including text analysis, generation, display and animation, text-to-speech, text-mining, and access to external resources (e.g., WordNet). RiTa.WordNet (RiWordNet) is one of the core objects in the RiTa toolkit that supports structures for a specific task. Briefly, RiWordNet is an intuitive interface to the WordNet ontology, providing definitions, glosses, and a range of -onyms (hypernyms, hyponyms, synonyms, antonyms, meronyms, etc.). It can be transparently bundled into a web-based, browser-executable program. RiWordNet can also be used for accessing WordNet via the RiTaServer. For most cases, it is simpler to use RiWordNet directly.

6 Results

In this research, synonyms are important for making sure that the answers given by students are correct. At this early stage, we designed a simple program to extract synonym sets (synsets) using RiTa.WordNet. Several words that are commonly used as class names in UML were tested, and the outputs are shown in Table 4 below.

Table 4 Synsets Output from Rita.WordNet

| Search word | Synsets |
|---|---|
| account | Synsets 0: accounting; Synsets 1: bill; Synsets 2: chronicle; Synsets 3: explanation; Synsets 4: history; Synsets 5: invoice; Synsets 6: report; Synsets 7: score; Synsets 8: story |
| customer | Synsets 0: client |
| client | Synsets 0: customer; Synsets 1: guest; Synsets 2: node |
| transaction | Synsets 0: dealing; Synsets 1: dealings |
| money | No Synsets! |

Initially, we tested only five words that are commonly used as UML class names: account, customer, client, transaction and money. Customer and client are two words that are synonyms of each other. We tested these two words to see whether they give the same output.

7 Discussion

As shown in Table 4, a search word can give several synsets or no synsets at all. This provides several choices for students in naming UML class diagram elements correctly even when the names are not provided in the answer scheme. RiTa.WordNet was used to extract synsets from WordNet. Other extractor tools such as TextCatch [53], TextToOnto [54] and Email Extractor [55] were reviewed, and RiTa.WordNet was judged the best choice for this research. We conclude that RiWordNet:

Is easy to use and understand.

Accesses the WordNet database directly through RiTaServer.

Gives simple output as requested.

Is easy to customize to one's own requirements (using Java programming).

Even though RiTa.WordNet offers many benefits compared with other tools, some problems appear when we search for words in the same synset. As an example from Table 4, when we search for the word customer, only client appears as a synset for customer. But when we search for the word client, there are two more synsets besides customer, namely guest and node. If customer is a synonym of client and guest is a synonym of client, we can say that the word guest is also a synonym of customer. However, the results we obtained show that RiTa.WordNet cannot give the complete set of synsets for the search words. So, some modification needs to be made in order to obtain all synsets for a search word.
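One possible modification, offered here only as a sketch and not as the method adopted in this study, is to treat the reported synsets as a symmetric relation and follow it transitively. We assume the two search words discussed above are customer and client, and we reuse their reported outputs as a small lookup table; the function name is hypothetical.

```python
from collections import deque

# Reported outputs for the two search words, used as a small lookup table.
REPORTED_SYNSETS = {
    "customer": ["client"],
    "client": ["customer", "guest", "node"],
}


def expand_synonyms(word, lookup):
    """Collect every word reachable from word through synonym links."""
    # Build an undirected graph so that each link works in both directions.
    graph = {}
    for source, synonyms in lookup.items():
        for synonym in synonyms:
            graph.setdefault(source, set()).add(synonym)
            graph.setdefault(synonym, set()).add(source)
    # Breadth-first search from the starting word.
    seen = {word}
    queue = deque([word])
    while queue:
        current = queue.popleft()
        for neighbour in graph.get(current, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    seen.discard(word)
    return seen
```

With this table, expanding "customer" yields client, guest and node. A caveat of this design is that it mixes word senses: "node" is related to "client" in a computing sense and is not a sensible synonym of "customer", so some sense filtering would still be needed.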

8 Conclusion and Future Work

The outcome of this study is a new framework for analyzing the semantics of object-oriented models. To implement this framework, a tool called RiTa.WordNet is adapted to get synsets from WordNet, and the synsets are then matched with the input given by the user. The extract and match steps are very important for analyzing the semantics in the proposed framework.

With this development, users do not need to search for synonym information directly in WordNet. The framework especially helps students in searching for related meanings in a particular domain (especially the modeling domain). In addition, students will save time in completing their tasks with accurate outcomes. We believe that this study will have a positive impact on the adaptation of WordNet in other ways.

Future work will focus on obtaining more synsets that are commonly used for UML class names using RiTa.WordNet. At this stage, we will seek a solution for getting more synsets for the search words.

9 Acknowledgement

This research was supported by a Tabung Bantuan Pendidikan Khas (TBPK) grant, Universiti Malaysia Terengganu (Vot: 53057), and a National Science Fellowship (NSF) under the Ministry of Science, Technology and Innovation (MOSTI), Malaysia.
