Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generating Rust bindings which include a type named String doesn't compile #1371

Open
DRvader opened this issue Apr 27, 2024 · 8 comments
Open

Comments

@DRvader
Copy link

DRvader commented Apr 27, 2024

Describe the bug
When trying to generate bindings to a type with the name String I get the error

Error:   × cxx couldn't handle our generated bindings - could be a bug in autocxx:
    │ unsupported unique_ptr target type

In my particular case the String type is being incidentally generated because it is being returned from another function. But it will reproduce when directly generating the type named String as well.

To Reproduce
I created a minimized example which reproduces this behaviour with a header named test.h

#pragma once

#include <memory>
#include <stdint.h>

namespace tester {
	struct String {
		char *data;
		uint32_t len;
	};
}

class Indirect {
public:
	tester::String data;

	std::unique_ptr<tester::String> datum() {
		return std::make_unique<tester::String>(data);
	}
};
use autocxx::prelude::*;

include_cpp! {
    #include "test.h"
    safety!(unsafe)
    generate!("Indirect")
}

fn main() {}

Additional notes
I did some initial digging and found that the cxxbridge that autocxx is generating contains

impl UniquePtr<String> {}
impl SharedPtr<String> {}
impl WeakPtr<String> {}
impl CxxVector<String> {}

without the corresponding type String = super::bindgen::root::tester::String; declaration (which appears later). If these lines are removed then compilation will fail with cxx0.h containing a function conversion which attempts to return a rust::String instead of tester::String.

I am happy to continue to explore/fix this problem, but I would like some guidance on where I should look to find the String -> rust::String conversion machinery.

@adetaylor
Copy link
Collaborator

Please file a test case PR following these instructions.

Unfortunately all the stuff around names in autocxx is very fragile so this might not be an easy fix.

nujhong added a commit to nujhong/autocxx that referenced this issue Oct 8, 2024
nujhong added a commit to nujhong/autocxx that referenced this issue Oct 8, 2024
@nujhong
Copy link

nujhong commented Oct 8, 2024

Added a test in #1402

I was able to create a simpler reproducible test case, I believe it's just name conflicts with the default prelude, using a different struct name other than String should generate bindings without problems

@b0o
Copy link

b0o commented Jan 1, 2025

Is there a workaround for this if it's not possible to rename the String class we're binding to?

@b0o
Copy link

b0o commented Jan 27, 2025

This is severely blocking progress for me and I haven't found a great workaround. I was able to write manual bindings for the String class in question, but I'm still having issues binding to classes which take or return Strings as arguments or return types.

I'm wondering if making it possible to specify an alternate name in the generate! macro call might be a workaround for cases like this, e.g. generate!("myns::String", "MyString"). Then, in the generated cxx::bridge, the original name could be specified with #[cxx_name = "String"].

@adetaylor
Copy link
Collaborator

I'd accept a PR to fix this. To be honest just having a PR which I can use to reproduce the problem in a test case might be enough for me to take a look. #1402 is the right sort of thing but I can't merge it due to the contributor not signing the (admittedly annoying) Google CLA.

There's a 50/50 chance that this is an easy fix, versus a bit of a nightmare which requires revisiting all the fragile naming stuff that I've been failing to rework for years.

b0o added a commit to b0o/autocxx that referenced this issue Jan 27, 2025
Adds one passing test and one currently failing test, the only difference being
the name of the class.
b0o added a commit to b0o/autocxx that referenced this issue Jan 27, 2025
Adds one passing test and one currently failing test, the only difference being
the name of the class.
b0o added a commit to b0o/autocxx that referenced this issue Jan 27, 2025
@b0o
Copy link

b0o commented Jan 27, 2025

I've actually just come up with a decent workaround by combining the following: manual bindings to my myns::String class using cxx, blocking autocxx from generating bindings with block!("myns::String"), and for some reason I needed to manually implement cxx::ExternType for one of my other autocxx-generated classes.

Now I'm able to generate other classes which have a few methods that accept/return myns::String. The other methods which accept/return "innocent" types are autogenerated properly, and I can manually bind to the myns::String-related methods using the cpp crate.

So, this is no longer urgent for me, although it would be nice it were fixed.

P.s. thank you Adrian for all of your hard work on this crate, in most cases it works spectacularly well and has saved me a ton of work!

@adetaylor
Copy link
Collaborator

Thanks - I'm glad you found a workaround.

As I mentioned, the naming stuff is annoyingly fragile, and unfortunately I think providing custom names for stuff falls into the category of "stuff I'm not quite brave enough to touch" until more work is done to clean it up - specifically a bunch of newtype wrappers for the different kinds of name we hold, which is roughly the thing tracked in #520.

adetaylor added a commit that referenced this issue Jan 29, 2025
This is an internal change which begins to dig us out of the mess of naming we
have inside autocxx. We deal with all sorts of different kinds of identifiers -
for Rust, for cxx, for C++, from bindgen - and convert between them in a fairly
ad-hoc basis. It's all too fragile to be able to tackle changes like #1371.

Fortunately, Rust makes it easy to solve these sorts of messes using newtype
wrappers. Unfortunately, I've tried that in the past and it's got to be a
huge muddle.

Here's another attempt at carving off part of the space. This introduces
such wrappers for two types of name:
* The [bindgen_original_name] identifiers discovered in bindgen
  attributes;
* The final name to be used for C++ function calls.

There are various FIXMEs added in locations where we're converting to or from
other types of name in questionable ways.

A bit of #520.
adetaylor added a commit that referenced this issue Jan 30, 2025
This is an internal change which begins to dig us out of the mess of naming we
have inside autocxx. We deal with all sorts of different kinds of identifiers -
for Rust, for cxx, for C++, from bindgen - and convert between them in a fairly
ad-hoc basis. It's all too fragile to be able to tackle changes like #1371.

Fortunately, Rust makes it easy to solve these sorts of messes using newtype
wrappers. Unfortunately, I've tried that in the past and it's got to be a
huge muddle.

Here's another attempt at carving off part of the space. This introduces
such wrappers for two types of name:
* The [bindgen_original_name] identifiers discovered in bindgen
  attributes;
* The final name to be used for C++ function calls.

There are various FIXMEs added in locations where we're converting to or from
other types of name in questionable ways.

A bit of #520.
@adetaylor
Copy link
Collaborator

I've now dug into #520 enough to refresh my memory of the naming stuff, and what it would take to fix this.

I'm not likely to tackle this myself, but here's what it would take. Each of these steps can be a separate PR. This would improve the maintainability of the codebase so I'm in favour of it; it's just not near the top of my priority list for the limited time I can personally spend on autocxx.

  1. A lot of ground work I already did in a couple of PRs on Abstract naming stuff #520.
  2. Each Api inside the autocxx engine has a QualifiedName associated. This is used for two things: first, to contain the actual name of the struct/function/etc. which is passed to cxx, and used in generated Rust code, and used in generated C++ code. Secondly, it's used to indicate dependency relationships between APIs (e.g. function X has a parameter of type Y). We need to split those usages, because fixing this bug is going to require renaming structs. The first step is to cease to use the name when doing codegen for functions. Any usages of the API name for Api::Function within codegen_rs or codegen_cpp need to be replaced with new fields on FnAnalysis.
  3. Similarly, we shouldn't rely on the existing name of any other type of API during codegen. Replace the QualifiedName with new fields which indicate the name to be used during codegen. At this point, take the opportunity to introduce new newtype wrappers for the different kinds of name; e.g. a CxxBridgeName type for any name that is to be used in the cxx::bridge.
  4. Retrofit those new newtype wrappers into the FnAnalysis.
  5. We can now replace the QualifiedName with a numeric ID, to track the dependency relationships between APIs.
  6. We can now set about renaming APIs to avoid conflicts. There's already a lot of code like this for functions - due to overloads. The function analysis stage already separately tracks the cxx::bridge name relative to the original C++ name and Rust name. We need to extract all that so that it applies to all types of API. We should ideally do this on the creation of any CxxBridgeName so that the existence of such a name guarantees uniqueness. Most APIs will need a separate CxxBridgeName, RustName and CppName. It might be desirable to lump them together into a new NameInfo structure which can be repeated across all APIs. This is a little bit like the old QualifiedName. As far as I can tell, only CxxBridgeName will need to be uniquified, because that's the only space of names which is flat. Rust names and C++ names are already in hierarchic structures of namespaces and we can reasonably assume that the C++ code, and bindgen, have respectively ensured uniqueness there.
  7. Creation of a CxxBridgeName should also pass the checks in validate_ident_ok_for_cxx, which should be expanded to reject things like String.
  8. We may need to introduce a new analysis phase to do some renaming. For example if a function B depends on a parameter of type A, and we've renamed A, we will need to rewrite the signature of function B within the cxx::bridge. However I'm moderately hopeful that all uniquifiication of CxxBridgeNames will happen early enough that this will be taken care of by small changes within TypeConverter.

This is all pretty major surgery but I have a reasonable amount of faith that the hundreds of integration tests will catch most of the corner cases, so it should be achievable.

Meanwhile it's quite possible that I'll raise a PR to simply reject APIs called String etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants