lmi
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lmi] Reference member = shared_ptr


From: Greg Chicares
Subject: Re: [lmi] Reference member = shared_ptr
Date: Tue, 2 Feb 2021 14:48:49 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.4.0

On 1/29/21 9:54 PM, Vadim Zeitlin wrote:
> On Fri, 29 Jan 2021 18:58:08 +0000 Greg Chicares <gchicares@sbcglobal.net> 
> wrote:
> 
> [organizing access to the cache]
> 
> GC> Strategically, I think the present way is fine: everything's
> GC> a shared_ptr.
> 
>  Sorry, I could well be missing something, but I still don't see any real
> need for shared pointers here. I.e. it seems to me that everything would
> work in exactly the same way if you kept unique pointers in the cache and
> just (const) references to them elsewhere. I'll assume the benefits of
> using unique, rather than shared, pointers are well-known and there is no
> need to re-enumerate them, so the question is: why do you think we need a
> shared pointer, when the ownership is not shared at all and, according to
> what you wrote, the object always belongs to the cache (which is its
> _unique_ owner)?

After spending some time looking for a better way, I'm coming to the
conclusion that Vaclav got everything right in his original:
  https://lists.nongnu.org/archive/html/lmi/2012-06/msg00007.html

The three changes I've considered are:

(1) In 'cache_file_reads.hpp', use a pointer to 'T const', not 'T':
-    using retrieved_type = std::shared_ptr<T>;
+    using retrieved_type = std::shared_ptr<T const>;
which is what Vaclav had done:
     typedef boost::shared_ptr<value_type const> value_ptr_type;
Thus: commit 6084765925 (which required prior commit 675f80e283).

That means (mostly) reverting a202f8754d3b9bf7 and instead spending
the time to follow the advice you had given:
  https://lists.nongnu.org/archive/html/lmi/2016-08/msg00006.html
| I think the way I'd do this would be to provide a way to test such
| what-if scenario in product_database itself by allowing to replace
| its "db_" member.
Thus: commit 057a41eac1b0.

(2) Hold cached data in client classes as a reference, instead
of holding the shared_ptr? No, that's wrong because

  https://lists.nongnu.org/archive/html/lmi/2012-06/msg00007.html
| shared_ptr<> instead of some other form of pointers because it neatly
| solves another problem: an entry in the cache could be invalidated by
| another access if the underlying file changed. But that shouldn't affect
| the instance already retrieved (nor should a pointer to a cached copy
| used somewhere suddenly become invalid) and shared pointers solve that
| neatly.

The problem, concretely, is that sophisticated users may modify
a (cached) file's contents, e.g., using the GUI product editor,
and then rerun an illustration with the modified product files,
which is a very useful capability. Therefore, we use
    using retrieved_type = std::shared_ptr<T const>;
    //        clients cannot modify contents ^^^^^
but not
    using retrieved_type = std::shared_ptr<T const> const; // No
    //        cache cannot refresh contents         ^^^^^  // No
The shared_ptr's lifetime extends to program termination, but
its contents may change while the program is running. Because
it's a shared_ptr, old data (preceding a file modification)
persists forever too, and is never invalidated as long as any
client retains a pointer to it. If a client class took the
shared_ptr as a ctor argument and held its dereferenced contents
in a 'const&' member, that reference could become dangling later;
instead, it needs to hold the shared_ptr as such, in order to
avoid decrementing its reference count.

(3) Use unique_ptr instead of shared_ptr...but that won't work
because the lifetime of the pointer itself is effectively forever,
but not the lifetime of the data it points to.

> GC> Tactically, however, I'm not so sure about all the variations:
> GC> 
> GC>  - class BasicValues holds a
> GC>     std::shared_ptr<product_data>       product_;
> GC> 
> GC> which seems fine (what could be more natural?);
> 
>  Using "product_data const&" would be more natural, of course.

It would be, in this case:

    using retrieved_type = std::shared_ptr<T const> const; // No
    //        cache cannot refresh contents         ^^^^^  // No

but that is not the case--i.e., I misled you here:

https://lists.nongnu.org/archive/html/lmi/2021-01/msg00031.html
| These really are intended to be global variables--initialized on
| demand, and persisting until program termination

because I had forgotten that cached file contents are refreshed
after modification.

> GC>  - product_database::initialize() has a local const reference:
> GC>     product_data const& p(*product_data::read_via_cache(f));
> GC> 
> GC> which seems fine because it's local (beware of the rather
> GC> ill-chosen class names: product_database != product_data);

The lifetime of that local 'p' is limited to a short code block:
its dereferenced value is used only in the immediately following
line:
    product_data const& p(*product_data::read_via_cache(f));
    std::string const filename(p.datum("DatabaseFilename"));
so this case is okay. Holding the pointer instead wouldn't make
a difference: it wouldn't be held long enough to matter.

> GC>  - class product_verifier has a const-reference data member:
> GC>     product_data     const& p_;
> GC> 
> GC> which seems...well, I dunno. I hesitated to do that, but
> GC> then I reasoned that this refers to cached data that's never
> GC> discarded, so it seems okay. But if we decide that's really
      ^^^^^^^^^ unless it's refreshed when the file is modified!
> GC> okay, then should we hold a 'product_data const&' in the
> GC> first example as well (class BasicValues), for uniformity?
> 
>  Yes, absolutely. And not only for uniformity, but to remove the need to
> have a shared_ptr<> in the first place, which is a bigger gain.

That's okay because class product_verifier exists only to verify
product files, and it would make no sense to modify a product
while verifying it. I.e., although this is a reference member
of class product_verifier:

    product_data     const& p_           ;

any instance of that class lives only for a fraction of a
second, so in effect it's local.

> GC> Or is initializing a reference member from a shared_ptr
> GC> anathema?
> 
>  Why would it be? Using a dangling reference would be anathema,

Aye, there's the rub. Thus the second sentence here:

 /// Instances are retrieved as shared_ptr<T const> so that they remain
 /// valid even when the file changes. The client is responsible for
 /// updating any stale pointers it holds, if it chooses to refresh the
 /// data; if it doesn't, then it uses older data, which remain valid
 /// as long as it holds a pointer to them.

What happens if the client doesn't choose to update stale data?
 - with shared_ptr: it just uses the old data (okay!)
 - with unique_ptr, or for data held as some_class const&: kaboom!

> and one way
> of avoiding it is to use shared pointers everywhere and never use
> references (or raw pointers, even non-owning ones) at all.

That's Vaclav's design.

> Another, and
> IMHO better, way of avoiding it is to clearly define the ownership and
> lifetime rules, as you did, by specifying that the objects are created and
                  ^^^^^^^^^^ but I erred
> owned by the cache and live until the program termination, and follow them.
> 
>  I.e. either we can rely on the objects having well-defined lifetime and
> then we don't need shared pointers at all. Or we can't rely on this, e.g.
> because the cache could be purged at any moment, and then we do need shared
> pointers and must not use (non-local) references anywhere. But IMO it
> doesn't make much sense to do something in the middle between the two
> approaches.

I think everything is reasonable now, as just pushed.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]