Age | Commit message (Collapse) | Author |
|
|
|
This improves the performance of Mesa's GL_MAP_UNSYNCHRONIZED_BIT path
in GL_ARB_map_buffer_range. Improves Unigine Tropics performance at
1024x768 by 2.30482% +/- 0.0492146% (n=61)
v2: Fix comment grammar.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
drmIoctl returns -1 on error with errno set to the error value. Other
users of it in this file just check for != 0, and only use errno when
they need to send an error value on to the caller of the API.
|
|
We've been hacking these constantly.
|
|
This will allow the driver to capture all of its execution state to a
file for later debugging. intel_gpu_dump is limited in that it only
captures batchbuffers, and Mesa's captures, while more complete, still
capture only a portion of the state involved in execution.
This is a squash commit of a long series of hacking as we tried to get
the resulting traces to work in the internal simulator. It contains
contributions by Yuanhan Liu and Kenneth Graunke.
v2: Drop the MI_FLUSH_ENABLE setup.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
For example:
export INTEL_DEVID_OVERRIDE=0x162
If this variable is set, don't actually submit the batchbuffer to the
GPU, it probably contains commands for the wrong generation of hardware.
v2: Introduce a getter for the overridden devid, and avoid getenv per exec.
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
|
|
xf86drmMode.h is missing a header protection. xf86drm.h has one so just
copy it and adjust the name.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
|
|
Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
|
|
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Matt Turner <mattst88@gmail.com>
|
|
This one doesn't have the 3DSTATE_HIER_DEPTH_BUFFER bug that the
previous one did.
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
Note that the regression test complains here: The batch that was
captured included a bug in its packet output, which was later fixed in
Mesa.
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
This requires pulling the gen6 3DSTATE_WM out to a function so it
doesn't override gen7's handler.
v2: Fix pasteo in interpreting ZW interpolation (thanks danvet!).
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
Every access to either the GTT or CPU pointer is supposed to be
proceeded by a set_domain ioctl so that GEM is able to manage the cache
domains correctly and for the following access to be coherent. Of
course, some people explicitly want incoherent, non-blocking access
which is going to trigger warnings by this patch but are probably better
served by explicit suppression.
v2: Also mark the pointers as inaccessible following the explicit unmap
and implicit unmap upon return to the cache.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
|
|
The empty string used for the not case is replaced by the default
if-else clause and so causes the configure to fail in the absence of
valgrind. Which is not quite what was intended.
Instead use the common idiom of setting a variable depending on whether
the true or false branch is taken and emit the conditional code as a
second step.
Reported-by: Tobias Jakobi <liquid.acid@gmx.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
In particular, declare the hidden CPU mmaps to valgrind so that it knows
about those memory regions.
v2: Add an additional VG_CLEAR for the getparam
References: https://bugs.freedesktop.org/show_bug.cgi?id=35071
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
[anholt: Ideally valgrind should just learn about the ioctls, and
removing the clear for the non-valgrindified code feels risky.]
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
Only account for the write domain in that case.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=43893 .
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
|
|
|
|
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
|
|
The mipmap level computation was wrong, we need to know the block
width, height, depth of compressed texture to properly compute this.
Change API to provide block width, height, depth instead of nblk_x,
nblk_y, nblk_z.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
|
|
We need to force 1D tiling only on old kernel the fallback was
broken along the way.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
|
|
work. sizeof() treats such parameters as pointers.
Signed-off-by: Ville Syrjälä <ville.syrjala at linux.intel.com>
|
|
Signed-off-by: Ville Syrjälä <ville.syrjala at linux.intel.com>
|
|
drmModeGetPlaneResources() and drmModeGetPlane() leaked in one error
path.
Signed-off-by: Ville Syrjälä <ville.syrjala at linux.intel.com>
|
|
The surface allocator is able to build complete miptree when allocating
surface for r600/r700/evergreen/northern islands GPU family. It also
compute bo size and alignment for render buffer, depth buffer and
scanout buffer.
v2 fix r6xx/r7xx 2D tiling width align computation
v3 add tile split support and fix 1d texture alignment
v4 rework to more properly support compressed format, split surface pixel
size and surface element size in separate fields
v5 support texture array (still issue on r6xx)
v6 split surface value computation and mipmap tree building, rework eg
and newer computation
v7 add a check for tile split and 2d tiled
v8 initialize mode value before testing it in all case, reenable
2D macro tile mode on r6xx for cubemap and array. Fix cubemap
to force array size to the number of face.
v9 fix handling of stencil buffer on evergreen
v10 on evergreen depth buffer need to have enough room for a stencil
buffer just after depth one
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
|
|
This adds support for querying the kernel about the LLC support in the
hardware.
In case the ioctl fails, we assume that it is present on GEN6 and GEN7.
v2: fix the return code checking
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
|
|
Commit efd6e81e inadvertently broke the build by looking for "i?86" or
"x86_64" in $host_os. The correct variable to check is $host_cpu.
This was preventing libdrm_intel.so from being built.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
|
|
This fixes a failure in 'make check' found by the tinderbox when trying to
build this code on Linux/ppc. This code is only designed to run on
Intel platforms, so don't even bother building it if we're not in that set.
Found-by: Tinderbox
Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
|
|
If the pci_device's actual gen was > 4, then we stupidly set
bufmgr_gem->gen = 6. Luckily this caused no bugs, and this fix shouldn't
change any behavior, because all checks against the gen currently have one
of the forms below:
gen == 2
gen == 3
gen >= 4
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
|
|
This just gets packet name and length in place, with the remainder
unfinished. I've long since finished the work that got me started
fixing up the decode.
|
|
|
|
Since CC_STATE_POINTERS for gen6 and 7 are quite different but use the
same opcode, move gen6 out to a helper function too, so we can use a
helper function for gen7.
|
|
|
|
This puts the error message in a consistent location relative to the
packet, and while I'm here I made the error message a bit more
informative.
Now, most static length packets need to just declare their length in
the table and not worry.
|
|
While I'm touching every line of the table, sort it by opcode.
|
|
I want to add packets, without contributing to the switch statement of
doom.
|
|
|
|
It's a lot nicer than using IS_WHATEVER(devid) all over the place, and
we have this in our other projects too.
|
|
The overflow checks were all thoroughly untested, and a bunch of the
ones I'm deleting were pretty broken. Now, in the case of overflow,
you just decode data of 0xd0d0d0d0, and instr_out prints the warning
message instead. Note that this still has the same issue of being
under-tested, but at least it's one place instead of per-packet.
A couple of BUFFER_FAIL uses are left where the length to be decoded
could be (significantly) larger than a page, and the decode didn't
just call instr_out (which doesn't dereference data itself unless it's
safe).
|
|
This reduces some of the extra derefs of the pointers.
|
|
Similar to BR00, count was always 1 and was always an index, not a count.
|
|
The count (actually index) was always 0, because BR00 is dword 0.
|
|
We still deref the context at the start of every call, but that will
change next.
|
|
Nothing was consuming it. If something wants this in the future,
would be done using the decode context anyway.
|
|
This is the start of plumbing the context through the decode
callchain instead of the current 4 arguments.
|
|
|
|
|
|
|
|
|