emacs-orgmode

From: Liu Hui
Subject: Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
Date: Thu, 17 Aug 2023 13:35:45 +0800

Hi,

Thank you for the patch!

> Next, for numpy arrays and pandas dataframes/series: these are
> converted to tables, for example:
>
> #+begin_src python
>   import pandas as pd
>   import numpy as np
>
>   return pd.DataFrame(np.array([[1,2,3],[4,5,6]]),
>                       columns=['a','b','c'])
> #+end_src
>
> #+RESULTS:
> |   | a | b | c |
> |---+---+---+---|
> | 0 | 1 | 2 | 3 |
> | 1 | 4 | 5 | 6 |
>
> To avoid conversion, you can specify "raw", "verbatim", "scalar", or
> "output" in the ":results" header argument.

Do we need to limit the table/list size by default, or convert them
only when a relevant result type (e.g. `table' or `list') is requested?
Dataframes and arrays are often large. Previously, the following
results were truncated by default, and the truncation could be tweaked
via np.set_printoptions and pd.set_option.

#+begin_src python
import numpy as np
return np.random.randint(10, size=(30,40))
#+end_src

#+begin_src python
import numpy as np
return np.random.rand(20,3,4,5)
#+end_src

#+begin_src python
import pandas as pd
import numpy as np

d = {'col1': np.random.rand(100), 'col2': np.random.rand(100)}
return pd.DataFrame(d)
#+end_src
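
For reference, here is a minimal sketch of the truncation behaviour I
mean, outside of Babel. It uses deterministic data instead of the
random arrays above, and assumes the default numpy/pandas print
options (threshold of 1000 elements, display.max_rows of 60); the
variable names are only for illustration.

```python
import sys

import numpy as np
import pandas as pd

# 30x40 = 1200 elements, above numpy's default summarization threshold (1000).
arr = np.arange(1200).reshape(30, 40)
short_arr = repr(arr)  # summarized, with '...' in place of the middle

# Temporarily lift the threshold to get the full representation.
with np.printoptions(threshold=sys.maxsize):
    full_arr = repr(arr)

# 100 rows, above pandas' default display.max_rows (60).
df = pd.DataFrame({'col1': np.arange(100.0), 'col2': np.arange(100.0)})
short_df = repr(df)  # head and tail only, plus a "[100 rows x 2 columns]" footer

# Temporarily disable the row limit to print every row.
with pd.option_context('display.max_rows', None):
    full_df = repr(df)
```

The same limits can be changed globally with np.set_printoptions and
pd.set_option; the context managers are used here only to keep the
change local.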

> +def __org_babel_python_format_value(result, result_file, result_params):
> +    with open(result_file, 'w') as f:
> +        if 'graphics' in result_params:
> +            result.savefig(result_file)
> +        elif 'pp' in result_params:
> +            import pprint
> +            f.write(pprint.pformat(result))
> +        else:
> +            if not set(result_params).intersection(\
> +['scalar', 'verbatim', 'raw']):
> +                try:
> +                    import pandas
> +                except ImportError:
> +                    pass
> +                else:
> +                    if isinstance(result, pandas.DataFrame):
> +                        result = [[''] + list(result.columns), None] + \

Here we could use '{}'.format(df.index.name) instead of '' to show the
name of the index.
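
A rough sketch of what I have in mind. The helper name
dataframe_to_table is hypothetical, and the way the body rows are
built is my own assumption, not the code from the patch. Note that
df.index.name is None for an unnamed index, so a plain
'{}'.format(...) would print 'None'; the sketch falls back to an empty
cell in that case.

```python
import pandas as pd


def dataframe_to_table(df):
    """Hypothetical helper: turn a DataFrame into a list of table rows.

    The first row is the header, None marks the horizontal rule (as in
    the patch), and the remaining rows are index value plus row values.
    """
    name = df.index.name
    # Unnamed index: keep the corner cell empty instead of showing 'None'.
    corner = '' if name is None else '{}'.format(name)
    header = [corner] + list(df.columns)
    rows = [[idx] + vals for idx, vals in zip(df.index, df.values.tolist())]
    return [header, None] + rows


df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=['a', 'b', 'c'])
unnamed_table = dataframe_to_table(df)

df.index.name = 'id'
named_table = dataframe_to_table(df)
```

With the named index the header row becomes ['id', 'a', 'b', 'c'];
with the unnamed one it stays ['', 'a', 'b', 'c'] as in the current
patch.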

>  (defun org-babel-python-format-session-value
>      (src-file result-file result-params)
>    "Return Python code to evaluate SRC-FILE and write result to RESULT-FILE."
> -  (format "\
> +  (concat org-babel-python--def-format-value
> +      (format "

Maybe `org-babel-python--def-format-value' could be evaluated only once
in session mode? That would shorten the string sent to the Python
shell, where temp files are used for long strings.


