[GNUnet-SVN] [taler-schemafuzz] branch master updated: roughly finished


From: gnunet
Subject: [GNUnet-SVN] [taler-schemafuzz] branch master updated: roughly finished content
Date: Wed, 29 Aug 2018 12:14:52 +0200

This is an automated email from the git hooks/post-receive script.

erwan-ulrich pushed a commit to branch master
in repository schemafuzz.

The following commit(s) were added to refs/heads/master by this push:
     new af18947  roughly finished content
af18947 is described below

commit af18947454efaca45ccfee61a6dd4fc12027bf57
Author: Feideus <address@hidden>
AuthorDate: Wed Aug 29 12:14:48 2018 +0200

    roughly finished content
---
 Documentation.tex                           | 119 +++++++++++---------------
 Documentation.tex => docs/Documentation.tex | 126 ++++++++++++----------------
 docs/PersonnalExperience.tex                |  59 +++++++++++++
 3 files changed, 165 insertions(+), 139 deletions(-)

diff --git a/Documentation.tex b/Documentation.tex
index c184d64..1af7a17 100644
--- a/Documentation.tex
+++ b/Documentation.tex
@@ -6,9 +6,10 @@
 \usepackage{pifont}
 \graphicspath{{/home/feideu/Work/Gnunet/schemafuzz/docs/}}
 \usepackage{graphicx}
+\usepackage{pdfpages}
 \usepackage{emp}
 \usetikzlibrary{shapes.arrows,chains}
-\usepackage[ngerman]{babel}
+\usepackage[english]{babel}
 
 \title{Documentation for schemaFuzz}
 \author{Ulrich "Feideus" Erwan}
@@ -24,7 +25,6 @@
 SchemaFuzz is a free software command line tool incorporated inside the Gnu 
Taler package 
 which is a free software electronic payment system providing anonymity for 
customers.
 The main goal of this project is to provide an effecient debbuging tool that 
uses a "fuzzing" strategy oriented on databases.  
-Traditionnal fuzzing is defined as "testing an automated software testing 
technique that involves providing invalid, unexpected, or random data as inputs 
to a computer program". SchemaFuzz uses this principle and applies it to the 
database field.
 Where a traditionnal fuzzer would send malformed input to a program, 
SchemaFuzz modifies the content of a database to test that program's behavior 
when stumbling on such unexpected data. \\*
 Obviously, this tool is meant to be used as a mean of debugging as the goal is 
to pop buggs or put into light the security breaches that the code may contain 
regarding the retrieving, usage and saving of a database's content.
 As this tool is being developped as a master's thesis project, its current 
state is far from being finished and there are many options and optimisations 
that deserve to be implemented that are not yet available.
@@ -32,12 +32,34 @@ These future/missing features will be detailed and 
discussed in a dedicated sect
 
        
        \section{Context and Perimeter}
-SchemaFuzz's developpement enrolls in the global dynamic of the past decades 
regarding internet  that sustain great efforts to make it a more fluid, 
pleasant but more importantly a safer space. This tool is meant to help 
developpers, mainteners and more genericly anyone that makes use of database 
comming from a database under his influence in their task. A good way to 
summerise the effect of this tool is to compare it with an "cyber attack 
simulator".
+SchemaFuzz's developpement enrolls in the global dynamic of the past decades 
regarding internet  that sustain great efforts to make it a more fluid, 
pleasant but more importantly a safer space.
+
+It uses the principle of "fuzz testing" or "fuzzing" to help find out which 
are the weak code paths of one's project. 
+                               \begin{quotation}
+Traditional fuzzing is defined as "an automated software testing technique 
that involves providing invalid, unexpected, or random data as inputs to a 
computer program".
+                               \end{quotation}         
+
+This principle is well illustrated by the following example:
+                               \begin{quotation}
+                               Let's consider an integer in a program, which 
stores the result of a user's choice between 3 questions. When the user picks 
one, the choice will be 0, 1 or 2. Which makes three practical cases. But what 
if we transmit 3, or 255 ? We can, because integers are stored a static size 
variable. If the default switch case hasn't been implemented securely, the 
program may crash and lead to "classical" security issues: (un)exploitable 
buffer overflows, DoS, ... 
+                               \end{quotation}
+
+Fuzzing comes in several categories, each focusing on a specific type of 
input.
+ 
+UI fuzzing focuses on button sequences and, more generally, any kind of user 
input during the execution of a program. The above example falls into this 
category.
+This principle has already been used successfully in existing fuzzing tools 
such as the well-known "American fuzzy lop".
+File format fuzzing generates multiple malformed samples and opens them 
sequentially.
+However, SchemaFuzz is a database-oriented fuzzer. This means that it focuses 
on triggering unexpected behavior related to the use of an external 
database's content.
+
+This tool is meant to help developers, maintainers and, more generally, anyone 
who makes use of data coming from a database under their control. A good way to 
summarise the effect of this tool is to compare it with a "cyber attack 
simulator".
 This means that the idea behind it is to emulate the damage that an attacker 
may cause subtly or not to a database he unlegitly gained privileges on. This 
might in theory go from a simple boolean flip (subtle modifications) to 
removing/adding content to purely and simply destroying or erasing all the 
content of the database.
 SchemaFuzz focuses on the first part : modification of the content of the 
database by single small modification that may or may not overlap. These 
modifications may be very aggressive of very subtle.
 It is intresting to point out that this last point also qualifies SchemaFuzz 
as a good "database structural flaw detector".
 That is to say that errors typically triggered by a poor management of a 
database (wrong data type usage, incoherence beetween database structure and 
use of the content etc ...) might also appear clearly during the execution.   
                \subsection{Perimeter}
+This tool is based on some of the SchemaSpy tool's source code. More 
precisely, it uses the portion of the code that detects and stores the target 
database's structure.
+The main goal of this project is to build, on top of this existing code, the 
functionality required to test how any kind of utility uses the database 
content.                
+The resulting software will generate a set of human-readable reports on each 
modification that was performed.                
                \begin{figure} [htbp]
                \centering
                \includegraphics[scale=1]{codeOriginDiagram.pdf}
@@ -46,71 +68,8 @@ That is to say that errors typically triggered by a poor 
management of a databas
                \subsection{When to use it}
 SchemaFuzz is a very usefull tool for anyone trying secure a piece of software 
that uses database ressources. The target software should be GDB compatible and 
the DBMS has to grant access to the target database through credentials passed 
as argument to this tool.
 
----It is very strongly advice to use a copy of the target databas erather than 
on the production material. Doing so will very likely result in the database 
being corrupted and not usable for any usefull mean.--
- 
-       \section{Usage}
-               \subsection{prerequisites}
-                       SchemaFuzz requires the presence of a list of libraries 
to work properly which are :
-                       \begin{itemize}
-                       \item org.apache.commons.math3 >= 3.6
-                       available at \\*
-                       
\url{https://commons.apache.org/proper/commons-math/download_math.cgi}          
        
-                       \end{itemize}
-The library has to be installed in the maven repository to be available. The 
instructions detailed at the following address explain how to do that. futher 
information can be found on the official maven website.\\*
+---It is very strongly advised to run this tool on a copy of the target 
database rather than on the production material. Running it on production data 
will very likely result in the database being corrupted and unusable for any 
useful purpose.
 
-                       
\url{https://www.mkyong.com/maven/how-to-include-library-manully-into-maven-local-repository/}
-                       
-               \subsection{setting up the code}
-                       Once all the depencies have been installed 
successfully, clone the source available on the official git taler repository 
\\*
-                       \url{https://git.taler.net/schemafuzz.git}
-                       \begin{verbatim}
-                        git clone https://git.taler.net/schemafuzz.git
-                       \end{verbatim}
-                       
-the folder containing the code shoud hold the rights for reading writing and 
executing (rwx) for the user that plans to run the tool.
-if this is not the case, you can give these rights like so
-                       \begin{verbatim}
-                       sudo chmod -R 700 schemafuzz
-                       \end{verbatim}
-               \subsection{Building}
-SchemaFuzz is using maven for building and library management purposes.
-Therefore, using the maven command line building script is way to go.
-Standard way of building :\\*
-                       \begin{verbatim}
-                       ./mvnw package
-                       \end{verbatim}
-                               
-This maven building method also offers alternative instructions for    more 
precise/refined way of building as well as compilation and test 
-launching options (those should only be intresting for the contributors).
-
-Launching the test suit :\\*
-                       \begin{verbatim}
-                       ./mvnw test
-                       \end{verbatim}
-Compiling the code :\\*                
-                       \begin{verbatim}
-                       ./mvnw compile
-                       \end{verbatim}
-               
-Other usefull commands: \\*            
-               
-                       \begin{verbatim}
-                       ./mvnw clean
-                       \end{verbatim}
-                       \begin{verbatim}
-                       ./mvnw validate
-                       \end{verbatim}
-                       \begin{verbatim}
-                       ./mvnw deploy
-                       \end{verbatim}
-               
-               \subsection{Setting up the database}    
-       
-Launch the "dbConfigure" script.
-                       \begin{verbatim}
-                               ./dbConfigure
-                       \end{verbatim}           
-       
        \section{Design}
                \subsection{Generic explanation}
 SchemaFuzz implementation is based on some bits of the SchemaSpy project 
source code.
@@ -314,13 +273,34 @@ The following points constitute the main flaw of the 
source code:
                        \item Hard to maintain. The code is not optimised 
either in term of size or                     efficency. Bad coding habits tend 
to make it rather weak and unstable to context changes.
                        \item Structure is not intuitive. The main loop of the 
program lacks a good             structure.
                        \end{itemize}
-                       
+
+       \section{Results and examples}
+In the process of being written.
+
        \section{Upcomming features and changes}
 This section will provide more insights on the future features that 
might/may/will be implemented as well as the changes in the existing code.
 Any sugestion will be greatly appriciated as long as it is relevent and well 
argumented. All the relevent information regarding the contributions are 
detailled in the so called section.
+
+               \subsection{General Report}
+In its future state, SchemaFuzz will generate a synthesized report concerning 
the overall execution of the tool (which it does not do right now). This 
general report will primarily contain the most "interesting" mutations (meaning 
the mutations with the highest score) for the whole run.
+A more advanced version of this report would also take into account the code 
coverage rate for each mutation and execute a last clustering round at the end 
of the execution to generate a "global" score that would represent the overall 
value of each mutation.
        
                \subsection{Code coverage}
 We are considering changing or simply adding code coverage in the clustering 
method as a parameters.Not only would this increase the accuracy of the scoring 
but also increase the accuracy of the "type" of each mutation. To this day, 
this tool does not make a concrete difference in terms of scoring or 
information generating (reports) beetween a mutation with a new stack trace in 
a very common code path and a very common stack trace in a very rarely 
triggered code path.
+
+               \subsection{Data type Pre-analyzing}
+The idea behind this planned feature is to implement some kind of 
"auto-learning" mechanism.
+To be more precise, this routine is meant to perform a statistical analysis 
on a representative portion of the database's content. This analysis would 
provide the rest of the program with the common values encountered for each 
field. More generally, this would give the software a global view of the format 
of the data that the database holds.
+Such a global understanding of the content format would make the possible 
modifications more relevant. Indeed, one of the major limitations of SchemaFuzz 
is its "blindness".
+That is to say that some of the modifications performed in the course of the 
execution of the program are irrelevant due to the lack of information on what 
is supposed to be stored in a given field.
+For instance, a field that only holds numerical values from 1 to 1000, 
even if it has enough bits to encode values from -32767 to 32767, would have a 
very low chance of triggering a crash if this software modifies its value from 
10 to 55.
+On the other hand, if the software modifies this very same field from 10 to 
-12000, then a crash is much more likely to occur.
+The same principle applies to strings. Suppose a field can encode 10 characters.
+The pre-analysis detected that, for this field, most of the values were 
surnames beginning with the letter "a". Changing this field from "Sylvain" to 
"Sylvaim" will probably not be very effective. However, changing this same 
field from "Sylvain" to "NULL" might indeed trigger an unexpected behavior. 
+  
+This pre-analysis routine would only be executed once at the start of the 
execution, right after the metadata extraction. The result of this analysis 
will be held by a specific object. 
+This object's lifespan is equal to the duration of the main loop's execution 
(so that every mutation can benefit from the analysis data).
+               
                \subsection{Centralised anonymous user data}
 SchemaFuzz's efficiency in thightly linked to the quality of its heuristics. 
this term includes the following points 
                \begin{itemize}
@@ -330,6 +310,9 @@ SchemaFuzz's efficiency in thightly linked to the quality 
of its heuristics. thi
                \item{Quantity of supported data types}
                \end{itemize}
 Knowing this, we are also concidering for futures enhancements an anonymous 
data collection  for each execution of this tool that will be statisticly 
computed to determine the best modification in average. This would improve the 
choosing mechanism by balancing the weights  depending on the modifcation's 
average quality. Modifications with higher average quality would see their 
weight increased (meaning they would get picked more frequently) and vice 
versa.                   
+
+\includepdf[pages=-]{PersonnalExperience.pdf}
+
        \section{Contributing}
 You can send your ideas at  \\*
                address@hidden
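
Editor's illustration, not part of the commit: the "Data type Pre-analyzing" subsection added above describes profiling each field once and then preferring mutations that fall outside the observed values. A minimal sketch of that idea follows, assuming a column is available as a plain Python list. The names `pre_analyse` and `is_out_of_profile` are hypothetical and do not come from the SchemaFuzz sources, which are written in Java.

```python
from collections import Counter


def pre_analyse(values):
    """Summarise the observed values of one column (hypothetical helper)."""
    # Keep real numbers only; bool is a subclass of int in Python, so skip it.
    numeric = [
        v for v in values
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    summary = {"most_common": Counter(values).most_common(3)}
    if numeric:
        summary["min"] = min(numeric)
        summary["max"] = max(numeric)
    return summary


def is_out_of_profile(summary, candidate):
    """Return True when a candidate value lies outside the observed range."""
    if "min" not in summary:
        # No numeric profile was recorded, so nothing rules the candidate out.
        return True
    return candidate < summary["min"] or candidate > summary["max"]


# Profile a field that only ever holds values between 1 and 1000.
profile = pre_analyse([10, 55, 1000, 1, 10])
```

With this profile, changing 10 to 55 stays inside the observed range (`is_out_of_profile(profile, 55)` is false), while changing 10 to -12000 falls outside it, which matches the example given in the patch text.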
diff --git a/Documentation.tex b/docs/Documentation.tex
similarity index 83%
copy from Documentation.tex
copy to docs/Documentation.tex
index c184d64..23fafa1 100644
--- a/Documentation.tex
+++ b/docs/Documentation.tex
@@ -6,11 +6,12 @@
 \usepackage{pifont}
 \graphicspath{{/home/feideu/Work/Gnunet/schemafuzz/docs/}}
 \usepackage{graphicx}
+\usepackage{pdfpages}
 \usepackage{emp}
 \usetikzlibrary{shapes.arrows,chains}
-\usepackage[ngerman]{babel}
+\usepackage[english]{babel}
 
-\title{Documentation for schemaFuzz}
+\title{Documentation for SchemaFuzz}
 \author{Ulrich "Feideus" Erwan}
 
 \begin{document}
@@ -24,20 +25,40 @@
 SchemaFuzz is a free software command line tool incorporated inside the Gnu 
Taler package 
 which is a free software electronic payment system providing anonymity for 
customers.
 The main goal of this project is to provide an effecient debbuging tool that 
uses a "fuzzing" strategy oriented on databases.  
-Traditionnal fuzzing is defined as "testing an automated software testing 
technique that involves providing invalid, unexpected, or random data as inputs 
to a computer program". SchemaFuzz uses this principle and applies it to the 
database field.
 Where a traditionnal fuzzer would send malformed input to a program, 
SchemaFuzz modifies the content of a database to test that program's behavior 
when stumbling on such unexpected data. \\*
 Obviously, this tool is meant to be used as a mean of debugging as the goal is 
to pop buggs or put into light the security breaches that the code may contain 
regarding the retrieving, usage and saving of a database's content.
 As this tool is being developped as a master's thesis project, its current 
state is far from being finished and there are many options and optimisations 
that deserve to be implemented that are not yet available.
-These future/missing features will be detailed and discussed in a dedicated 
section.
 
        
        \section{Context and Perimeter}
-SchemaFuzz's developpement enrolls in the global dynamic of the past decades 
regarding internet  that sustain great efforts to make it a more fluid, 
pleasant but more importantly a safer space. This tool is meant to help 
developpers, mainteners and more genericly anyone that makes use of database 
comming from a database under his influence in their task. A good way to 
summerise the effect of this tool is to compare it with an "cyber attack 
simulator".
+SchemaFuzz's developpement enrolls in the global dynamic of the past decades 
regarding internet  that sustain great efforts to make it a more fluid, 
pleasant but more importantly a safer space.
+
+It uses the principle of "fuzz testing" or "fuzzing" to help find out which 
are the weak code paths of one's project. 
+                               \begin{quotation}
+Traditional fuzzing is defined as "an automated software testing technique 
that involves providing invalid, unexpected, or random data as inputs to a 
computer program".
+                               \end{quotation}         
+
+This principle is well illustrated by the following example:
+                               \begin{quotation}
+                               Let's consider an integer in a program, which 
stores the result of a user's choice between 3 questions. When the user picks 
one, the choice will be 0, 1 or 2. Which makes three practical cases. But what 
if we transmit 3, or 255 ? We can, because integers are stored a static size 
variable. If the default switch case hasn't been implemented securely, the 
program may crash and lead to "classical" security issues: (un)exploitable 
buffer overflows, DoS, ... 
+                               \end{quotation}
+
+Fuzzing comes in several categories, each focusing on a specific type of 
input.
+ 
+UI fuzzing focuses on button sequences and, more generally, any kind of user 
input during the execution of a program. The above example falls into this 
category.
+This principle has already been used successfully in existing fuzzing tools 
such as the well-known "American fuzzy lop".
+File format fuzzing generates multiple malformed samples and opens them 
sequentially.
+However, SchemaFuzz is a database-oriented fuzzer. This means that it focuses 
on triggering unexpected behavior related to the use of an external 
database's content.
+
+This tool is meant to help developers, maintainers and, more generally, anyone 
who makes use of data coming from a database under their control. A good way to 
summarise the effect of this tool is to compare it with a "cyber attack 
simulator".
 This means that the idea behind it is to emulate the damage that an attacker 
may cause subtly or not to a database he unlegitly gained privileges on. This 
might in theory go from a simple boolean flip (subtle modifications) to 
removing/adding content to purely and simply destroying or erasing all the 
content of the database.
 SchemaFuzz focuses on the first part : modification of the content of the 
database by single small modification that may or may not overlap. These 
modifications may be very aggressive of very subtle.
 It is intresting to point out that this last point also qualifies SchemaFuzz 
as a good "database structural flaw detector".
 That is to say that errors typically triggered by a poor management of a 
database (wrong data type usage, incoherence beetween database structure and 
use of the content etc ...) might also appear clearly during the execution.   
                \subsection{Perimeter}
+This tool is based on some of the SchemaSpy tool's source code. More 
precisely, it uses the portion of the code that detects and stores the target 
database's structure.
+The main goal of this project is to build, on top of this existing code, the 
functionality required to test how any kind of utility uses the database 
content.                
+The resulting software will generate a set of human-readable reports on each 
modification that was performed.                
                \begin{figure} [htbp]
                \centering
                \includegraphics[scale=1]{codeOriginDiagram.pdf}
@@ -46,71 +67,8 @@ That is to say that errors typically triggered by a poor 
management of a databas
                \subsection{When to use it}
 SchemaFuzz is a very usefull tool for anyone trying secure a piece of software 
that uses database ressources. The target software should be GDB compatible and 
the DBMS has to grant access to the target database through credentials passed 
as argument to this tool.
 
----It is very strongly advice to use a copy of the target databas erather than 
on the production material. Doing so will very likely result in the database 
being corrupted and not usable for any usefull mean.--
- 
-       \section{Usage}
-               \subsection{prerequisites}
-                       SchemaFuzz requires the presence of a list of libraries 
to work properly which are :
-                       \begin{itemize}
-                       \item org.apache.commons.math3 >= 3.6
-                       available at \\*
-                       
\url{https://commons.apache.org/proper/commons-math/download_math.cgi}          
        
-                       \end{itemize}
-The library has to be installed in the maven repository to be available. The 
instructions detailed at the following address explain how to do that. futher 
information can be found on the official maven website.\\*
-
-                       
\url{https://www.mkyong.com/maven/how-to-include-library-manully-into-maven-local-repository/}
-                       
-               \subsection{setting up the code}
-                       Once all the depencies have been installed 
successfully, clone the source available on the official git taler repository 
\\*
-                       \url{https://git.taler.net/schemafuzz.git}
-                       \begin{verbatim}
-                        git clone https://git.taler.net/schemafuzz.git
-                       \end{verbatim}
-                       
-the folder containing the code shoud hold the rights for reading writing and 
executing (rwx) for the user that plans to run the tool.
-if this is not the case, you can give these rights like so
-                       \begin{verbatim}
-                       sudo chmod -R 700 schemafuzz
-                       \end{verbatim}
-               \subsection{Building}
-SchemaFuzz is using maven for building and library management purposes.
-Therefore, using the maven command line building script is way to go.
-Standard way of building :\\*
-                       \begin{verbatim}
-                       ./mvnw package
-                       \end{verbatim}
-                               
-This maven building method also offers alternative instructions for    more 
precise/refined way of building as well as compilation and test 
-launching options (those should only be intresting for the contributors).
+---It is very strongly advised to run this tool on a copy of the target 
database rather than on the production material. Running it on production data 
will very likely result in the database being corrupted and unusable for any 
useful purpose.
 
-Launching the test suit :\\*
-                       \begin{verbatim}
-                       ./mvnw test
-                       \end{verbatim}
-Compiling the code :\\*                
-                       \begin{verbatim}
-                       ./mvnw compile
-                       \end{verbatim}
-               
-Other usefull commands: \\*            
-               
-                       \begin{verbatim}
-                       ./mvnw clean
-                       \end{verbatim}
-                       \begin{verbatim}
-                       ./mvnw validate
-                       \end{verbatim}
-                       \begin{verbatim}
-                       ./mvnw deploy
-                       \end{verbatim}
-               
-               \subsection{Setting up the database}    
-       
-Launch the "dbConfigure" script.
-                       \begin{verbatim}
-                               ./dbConfigure
-                       \end{verbatim}           
-       
        \section{Design}
                \subsection{Generic explanation}
 SchemaFuzz implementation is based on some bits of the SchemaSpy project 
source code.
@@ -314,13 +272,34 @@ The following points constitute the main flaw of the 
source code:
                        \item Hard to maintain. The code is not optimised 
either in term of size or                     efficency. Bad coding habits tend 
to make it rather weak and unstable to context changes.
                        \item Structure is not intuitive. The main loop of the 
program lacks a good             structure.
                        \end{itemize}
-                       
+
+       \section{Results and examples}
+In the process of being written.
+
        \section{Upcomming features and changes}
 This section will provide more insights on the future features that 
might/may/will be implemented as well as the changes in the existing code.
 Any sugestion will be greatly appriciated as long as it is relevent and well 
argumented. All the relevent information regarding the contributions are 
detailled in the so called section.
+
+               \subsection{General Report}
+In its future state, SchemaFuzz will generate a synthesized report concerning 
the overall execution of the tool (which it does not do right now). This 
general report will primarily contain the most "interesting" mutations (meaning 
the mutations with the highest score) for the whole run.
+A more advanced version of this report would also take into account the code 
coverage rate for each mutation and execute a last clustering round at the end 
of the execution to generate a "global" score that would represent the overall 
value of each mutation.
        
                \subsection{Code coverage}
 We are considering changing or simply adding code coverage in the clustering 
method as a parameters.Not only would this increase the accuracy of the scoring 
but also increase the accuracy of the "type" of each mutation. To this day, 
this tool does not make a concrete difference in terms of scoring or 
information generating (reports) beetween a mutation with a new stack trace in 
a very common code path and a very common stack trace in a very rarely 
triggered code path.
+
+               \subsection{Data type Pre-analyzing}
+The idea behind this planned feature is to implement some kind of 
"auto-learning" mechanism.
+To be more precise, this routine is meant to perform a statistical analysis 
on a representative portion of the database's content. This analysis would 
provide the rest of the program with the common values encountered for each 
field. More generally, this would give the software a global view of the format 
of the data that the database holds.
+Such a global understanding of the content format would make the possible 
modifications more relevant. Indeed, one of the major limitations of SchemaFuzz 
is its "blindness".
+That is to say that some of the modifications performed in the course of the 
execution of the program are irrelevant due to the lack of information on what 
is supposed to be stored in a given field.
+For instance, a field that only holds numerical values from 1 to 1000, 
even if it has enough bits to encode values from -32767 to 32767, would have a 
very low chance of triggering a crash if this software modifies its value from 
10 to 55.
+On the other hand, if the software modifies this very same field from 10 to 
-12000, then a crash is much more likely to occur.
+The same principle applies to strings. Suppose a field can encode 10 characters.
+The pre-analysis detected that, for this field, most of the values were 
surnames beginning with the letter "a". Changing this field from "Sylvain" to 
"Sylvaim" will probably not be very effective. However, changing this same 
field from "Sylvain" to "NULL" might indeed trigger an unexpected behavior. 
+  
+This pre-analysis routine would only be executed once at the start of the 
execution, right after the metadata extraction. The result of this analysis 
will be held by a specific object. 
+This object's lifespan is equal to the duration of the main loop's execution 
(so that every mutation can benefit from the analysis data).
+               
                \subsection{Centralised anonymous user data}
 SchemaFuzz's efficiency in thightly linked to the quality of its heuristics. 
this term includes the following points 
                \begin{itemize}
@@ -330,10 +309,15 @@ SchemaFuzz's efficiency in thightly linked to the quality 
of its heuristics. thi
                \item{Quantity of supported data types}
                \end{itemize}
 Knowing this, we are also concidering for futures enhancements an anonymous 
data collection  for each execution of this tool that will be statisticly 
computed to determine the best modification in average. This would improve the 
choosing mechanism by balancing the weights  depending on the modifcation's 
average quality. Modifications with higher average quality would see their 
weight increased (meaning they would get picked more frequently) and vice 
versa.                   
+
+\includepdf[pages=-]{PersonnalExperience.pdf}
+
        \section{Contributing}
 You can send your ideas at  \\*
                address@hidden
 Or directly create a pull request on the official repository to edit this 
document and/or the code itself
-       \section{Conclusion}
+       
+       
+       
 \end{empfile}
 \end{document} 
diff --git a/docs/PersonnalExperience.tex b/docs/PersonnalExperience.tex
index f138a7b..01f8492 100644
--- a/docs/PersonnalExperience.tex
+++ b/docs/PersonnalExperience.tex
@@ -36,5 +36,64 @@ Other the other hand, it is a personnal reminder of what 
should be improved in m
        \end{itemize}           
 
        \subsection{Positive outcomes}
+Throughout the development of the project, I have had the chance to acquire 
many new skills and improve existing ones. I will give more 
insight into what this project and, more generally, this internship as a 
developer for a GNU package have brought me.
 
+               \subsubsection{Technical aspect}
+               
+               \paragraph{Java language}
+In many ways, this project has been a real challenge. The main difficulty 
that I encountered was the technical challenge that arose when the project 
started. Indeed, it was my first time conducting a project of the size of 
SchemaFuzz. The size of the project and the fact that I was the only one 
developing the tool implied that every aspect of the project, independently of 
the language that was used for each module, had to be designed and implemented 
by me alone.
+Even though I was already accustomed to Java programming, I was struck by the 
complexity and the architecture of a "real" in-production software like 
SchemaSpy, which I had to look into to get the metadata extraction routine.
+This was my first improvement: code structure. Even if my coding skills 
can still be perfected in many ways, I feel that understanding and reusing 
complex and well-structured code gave me a much better idea of what "good code" 
really is. Integrating these concepts strengthened my development skills, and I 
am now much more confident about them.
+
+Apart from the Java language, which I was already familiar with, I also had 
the chance to get my hands on new technologies (or technologies I had never 
really had the chance to practice in real conditions). 
+
+                       \paragraph{SQL language}
+SchemaFuzz is a database fuzzer. Naturally, a major component of the work for 
its development was to create and handle SQL requests and responses. In order 
to do that, I had to study for a while, as I was lacking some 
knowledge of databases in general. After gaining a better understanding of how 
databases operate in theory, I had to go into more depth concerning the inner 
structure of constraints and the way data types are encoded in most DBMSes.
+This brings me to my next point regarding the handling of SQL in this project.
+
+                       \paragraph{DBMS(PostgreSQL)} 
+SchemaFuzz's first and foremost goal is to help in the debugging and 
maintenance of the GNU Taler payment system. GNU Taler databases are managed 
by the PostgreSQL DBMS. Therefore, PostgreSQL was the natural choice of 
technology for SQL management in this project.
+Never having worked with PostgreSQL before, I had to adapt my habits when 
dealing with the DBMS itself.
+By doing so, and by stumbling on error messages I had never seen before, I had 
the chance to look more deeply into the structure of DBMSes in general. In 
particular, I had to get my hands on the internal PostgreSQL tables in order to 
understand how different databases were managed within the same environment.
+
+                       \paragraph{Shell/Bash Scripting} 
+As a part of the development of the analyzer for SchemaFuzz, I have had the 
chance to write several bash scripts. This exercise was a true 
pleasure for me as well as very instructive.
+Spending some time on writing parsing scripts had me look into how parsing is 
usually implemented for such jobs.
+With this experience, I now better understand how the 
components of a project connect to each other. 
+Even though I was aware of the power of scripting in general, I have now come 
to understand how crucial it is to understand and be able to 
write scripts when working in a Linux environment.
+In the big picture, I feel that I have gained a precious asset by practicing 
scripting on a technical level. This also gave me the chance to develop my own 
scripts for personal use in my own environment, and to go through 
more conceptual and theoretical documents on what scripting really is and how 
it should be used.
+                       
+                        \paragraph{LaTeX}
+By writing this documentation, I had to learn how to create and process 
properly presented and properly styled scientific documents. In this process, I 
first learned and then practiced LaTeX as well as the very handy TikZ and 
MetaUML packages used for graphical representations.
+I did not consider creating and implementing graphics (in this case) to be a 
real coding challenge, but some of them proved me terribly wrong. Spending time 
on finding the right syntax for what I wanted to show strengthened my project 
management skills and reinforced my belief that presentation and creation 
of a project are two sides of the same coin and that both should be treated 
with the same amount of seriousness.
+       
+
+               \subsubsection{Human aspect}
+               
+                       \paragraph{Languages}
+The development of my project was conducted in a German-speaking environment, 
and German is a language I am not very familiar with. 
+This led to all communication, both regarding the project and 
other subjects, taking place in English. This contributed to my improvement in 
both spoken and written English (this document is also excellent training for 
written content) as well as my overall comprehension.
+Apart from the purely linguistic point of view, discussing complex topics in 
English gave me the keys to expressing ideas and concepts in a more concise and 
clearer way.
+
+                       \paragraph{Political maturity}
+Disclaimer: with this paragraph, I am not pushing forward any idea in 
particular. All I intend to do is explain in more detail how 
rich the environment was during this internship.
+                       
+Surprisingly, I have had the chance to meet many people who shared various 
political points of view regarding computer science and technologies. On these 
subjects, it was truly enriching to debate things such as morality, 
ethics or freedom.
+Some other topics that are further away from science were brought up, such as 
veganism, green energies, or anarchism.
+I hold very dear the moments I spent discussing and confronting my own ideas, 
because I feel that this has allowed me to gain maturity in my political 
positions.
+
+       \section{Conclusion}
+   
+The development of SchemaFuzz and my work for GNU Taler was spread over a 
period of 6 months.
+Within this time span, I have discovered the fields of research and real 
software development.
+This discovery has been very beneficial to me in the sense that it gave me the 
chance to acquire experience on both the theoretical and technical sides, as 
well as to master some new technologies and new aspects of the field of 
computer science in general.
+
+My work for GNU Taler was primarily to imagine, conceptualize and develop a 
database-oriented fuzzing tool. 
+First, I focused on bringing the software from the "general idea" that 
was given to me by my internship supervisor to a concrete and structured 
project. In the process of creation, I started by defining which precise 
features were critical and with what technology they would be implemented.
+
+The main task of SchemaFuzz is to inject malformed data into a specific 
database in order to trigger crashes or unexpected behavior from the program 
that uses the content of this database.
+By working on this project for the past 6 months, I have brought it to a point 
where it fulfills its main task. I have used a sample database containing 
content with a wide variety of data types to test the project throughout the 
course of the development. However, the application is meant to evolve to a 
more advanced state. Such a big project requires much more time than I had 
to become fully operational.
+
+Finally, I am convinced that carrying out this project was a truly 
rewarding experience in academic, technical and human terms. All the 
knowledge acquired as a GNU developer strengthened the concepts I had learned in 
my academic courses. Moreover, this internship was an excellent social 
experience thanks to the amount of contact with very bright professors, PhD 
students and other interns.
+   
 \end{document}
\ No newline at end of file

-- 
To stop receiving notification emails like this one, please contact
address@hidden


